ChatBot with RAG on PDF
🔗 Deployed Link: ChatBot with RAG on PDF
Tech Stack
- Python
- Streamlit
- LangChain
- GROQ
- LangSmith
Features
Document Ingestion
- Accepts one or more PDFs as input.
- Loads the PDFs with LangChain document loaders, then cleans the text and splits it into chunks.
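To make the splitting step concrete, here is a minimal, dependency-free sketch of fixed-size chunking with overlap. It is an illustration only, not the project's code: the app relies on LangChain for this step, and the function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper; the sizes are illustrative defaults, not the
    values used by the deployed app.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    # Collapse runs of whitespace and newlines left over from PDF extraction.
    cleaned = " ".join(text.split())

    # Each window starts `step` characters after the previous one, so
    # consecutive chunks share `overlap` characters.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(cleaned), step):
        chunks.append(cleaned[start:start + chunk_size])
        if start + chunk_size >= len(cleaned):
            break  # the last window already reaches the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.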
Embedding & Vector Store
- Generates embeddings using Hugging Face models.
- Stores and indexes them in a FAISS vector store for efficient similarity search.
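The idea behind this step can be sketched without any external libraries. The example below is a stand-in, not the project's implementation: the toy word-count `embed` function replaces a Hugging Face embedding model, and the brute-force `FlatIndex` class replaces FAISS. Both names, and the fixed vocabulary, are assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text: str, vocab: list[str]) -> list[float]:
    """Toy embedding: L2-normalised word counts over a fixed vocabulary.

    A real deployment would call a Hugging Face embedding model here.
    """
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class FlatIndex:
    """Exact inner-product search by scanning every stored vector."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []
        self.chunks: list[str] = []

    def add(self, chunk: str, vector: list[float]) -> None:
        self.chunks.append(chunk)
        self.vectors.append(vector)

    def search(self, query_vector: list[float], k: int = 3) -> list[tuple[float, str]]:
        # On unit-length vectors the inner product equals cosine similarity.
        scored = [
            (sum(q * v for q, v in zip(query_vector, vec)), chunk)
            for vec, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A usage sketch: embed each chunk once at upload time and call `add`, then embed the user's question with the same function and pass it to `search`. A dedicated library such as FAISS serves the same purpose with far better performance on large collections.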
RAG Pipeline
- Retrieves relevant document chunks based on user queries.
- Feeds the context into an open-source LLM for grounded answer generation.
- Powered by GROQ for fast, accurate responses.
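As a rough sketch of how retrieved chunks can be fed to the model, the snippet below assembles a grounded prompt and hands it to an LLM passed in as a plain callable. The prompt wording, the function names `build_prompt` and `answer`, and the callable interface are all assumptions; the actual app wires this through LangChain and a GROQ-hosted model.

```python
from typing import Callable

# Illustrative prompt wording; the deployed app may phrase this differently.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str, chunks: list[str]) -> str:
    """Number each retrieved chunk and place it ahead of the question."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return PROMPT_TEMPLATE.format(context=context, question=question)


def answer(question: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Send the grounded prompt to any callable that maps a prompt to text."""
    return llm(build_prompt(question, chunks))
```

Keeping the LLM behind a callable makes the prompt logic easy to test with a stub, and the hosted model can be swapped without touching the rest of the pipeline.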
User Interface
- Built with Streamlit for seamless UX.
- Users can upload PDFs and ask contextual questions.
Observability & Monitoring
- Integrated with LangSmith for pipeline debugging and performance tracking.
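For LangChain applications, LangSmith tracing is typically switched on through environment variables rather than code changes. The sketch below sets them from Python; the helper name and the project name `rag-pdf-chatbot` are hypothetical, and the variable names follow the `LANGCHAIN_`-prefixed convention, so check the LangSmith documentation for the names your installed version expects.

```python
import os


def enable_langsmith_tracing(project: str) -> bool:
    """Turn on tracing and report whether an API key is already configured.

    Hypothetical helper. The API key itself should come from the
    environment or a secrets manager, never from source code.
    """
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_PROJECT"] = project  # groups runs in the LangSmith UI
    return bool(os.environ.get("LANGCHAIN_API_KEY"))


if not enable_langsmith_tracing("rag-pdf-chatbot"):
    print("LANGCHAIN_API_KEY is not set; traces will not be uploaded.")
```

These variables need to be set before the chain runs so that each retrieval and generation step is recorded as part of the same trace.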
Workflow
PDF Upload
Users upload documents; text is extracted and preprocessed.
Embedding & Storage
Content is chunked and converted into embeddings using Hugging Face models, then stored in FAISS.
Query Handling
User queries are matched against the vector store to fetch the most relevant chunks.
Answer Generation
Retrieved context is passed into an LLM, and the generated answer is returned to the user.
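The four steps above can be sketched as a small pipeline whose stages are supplied as plain callables. This shows only the order of operations and the data passed between stages; the class name `RagPipeline` and its field names are assumptions, and in the real app each stage would be backed by LangChain, Hugging Face, FAISS, and GROQ respectively.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RagPipeline:
    """Hypothetical wiring of the workflow; every stage is pluggable."""

    extract: Callable[[bytes], str]            # PDF bytes -> cleaned text
    chunk: Callable[[str], list[str]]          # text -> chunks
    index: Callable[[list[str]], None]         # embed chunks and store them
    retrieve: Callable[[str], list[str]]       # query -> most relevant chunks
    generate: Callable[[str, list[str]], str]  # query + context -> answer

    def ingest(self, pdfs: list[bytes]) -> int:
        """PDF Upload, then Embedding & Storage. Returns the chunk count."""
        chunks: list[str] = []
        for pdf in pdfs:
            chunks.extend(self.chunk(self.extract(pdf)))
        self.index(chunks)
        return len(chunks)

    def ask(self, question: str) -> str:
        """Query Handling, then Answer Generation."""
        return self.generate(question, self.retrieve(question))
```

Ingestion runs once per upload, while `ask` runs once per question, which is why the two are kept as separate methods.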
Highlights
- 📄 Multi-PDF ingestion
- 🔍 RAG-powered contextual answers
- ⚡ Fast inference using GROQ and FAISS
- 📈 Observability via LangSmith