ChatBot with RAG on PDF

Published: December 09, 2025

🔗 Deployed Link: ChatBot with RAG on PDF

Tech Stack

Python
Streamlit
LangChain
Groq
LangSmith

Features

Document Ingestion

Accepts one or more PDFs as input.
Uses LangChain document loaders to extract the text, which is then cleaned and split into chunks.
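The cleaning and splitting step can be illustrated with a small, dependency-free sketch. This is not the project's code: it is a simplified stand-in for a LangChain text splitter such as RecursiveCharacterTextSplitter, and the function names and default sizes are assumptions chosen for illustration.

```python
def clean_text(text: str) -> str:
    """Collapse runs of whitespace (newlines, tabs, repeated spaces) into single spaces."""
    return " ".join(text.split())


def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap by chunk_overlap.

    The default sizes are illustrative assumptions, not values from the project.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    chunks = []
    start = 0
    step = chunk_size - chunk_overlap  # how far the window advances each time
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk. A typical call would be split_text(clean_text(page_text)) for each extracted page.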

Embedding & Vector Store

Generates embeddings using Hugging Face models.
Stores and indexes them in a FAISS vector index for efficient similarity search.
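To make the embed-then-index idea concrete, here is a toy sketch using only the standard library. It deliberately swaps in a bag-of-words count vector for the Hugging Face embedding model and a brute-force cosine scan for FAISS, so it shows the shape of the technique rather than the project's actual components. TinyVectorStore, embed, and cosine are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a sparse bag-of-words count vector (NOT a neural embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Brute-force stand-in for a vector index: stores vectors, scans all on search."""

    def __init__(self):
        self._chunks = []
        self._vectors = []

    def add(self, chunks):
        # Embed each chunk once at ingestion time and keep it next to its text.
        for chunk in chunks:
            self._chunks.append(chunk)
            self._vectors.append(embed(chunk))

    def search(self, query, k=3):
        # Score every stored vector against the query and return the top k
        # as (score, chunk) pairs, best first.
        query_vector = embed(query)
        scored = [
            (cosine(query_vector, vector), chunk)
            for vector, chunk in zip(self._vectors, self._chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A real embedding model captures meaning rather than exact word overlap, and an index library avoids re-implementing the search and offers approximate variants for large collections, but the add and search interface stays essentially the same.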

RAG Pipeline

Retrieves relevant document chunks based on user queries.
Feeds the context into an open-source LLM for grounded answer generation.
Runs LLM inference on Groq for low-latency responses.
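The retrieve-then-generate step can be sketched as follows. This is a minimal illustration under stated assumptions: retrieve and llm are hypothetical callables supplied by the caller (in the app they would wrap the vector store lookup and a Groq-hosted model respectively), and the prompt wording is an example, not the project's prompt.

```python
from typing import Callable


def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble a grounded prompt: numbered context chunks followed by the question."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer(
    question: str,
    retrieve: Callable[[str, int], list[str]],  # hypothetical: (query, k) -> chunks
    llm: Callable[[str], str],                  # hypothetical: prompt -> completion
    k: int = 4,
) -> str:
    """Fetch the k most relevant chunks, then ask the model to answer from them."""
    chunks = retrieve(question, k)
    return llm(build_prompt(question, chunks))
```

Passing the retriever and the model in as plain callables keeps the pipeline easy to test with stubs and makes it simple to change the model provider later.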

User Interface

Built with Streamlit as a single-page web app.
Users can upload PDFs and ask contextual questions.

Observability & Monitoring

Integrated with LangSmith for pipeline debugging and performance tracking.
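LangSmith tracing for LangChain is usually switched on through environment variables rather than code changes. The sketch below assumes the LANGCHAIN_-prefixed variable names that LangChain has documented for this purpose; newer releases also document LANGSMITH_-prefixed equivalents, so check the current docs for your version. The helper name and the default project name are hypothetical.

```python
import os


def enable_langsmith_tracing(api_key: str, project: str = "rag-pdf-chatbot") -> None:
    """Set the environment variables LangChain reads to send traces to LangSmith.

    Call this before building any chains so the settings are picked up.
    The default project name is a placeholder, not the project's real one.
    """
    os.environ["LANGCHAIN_TRACING_V2"] = "true"
    os.environ["LANGCHAIN_API_KEY"] = api_key
    os.environ["LANGCHAIN_PROJECT"] = project
```

In a deployed app the key would normally come from a secrets store or the hosting platform's environment settings rather than being written in source.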

Workflow

PDF Upload
Users upload documents; text is extracted and preprocessed.
Embedding & Storage
Content is chunked and converted into embeddings using Hugging Face models, then stored in FAISS.
Query Handling
User queries are matched against the vector store to fetch the most relevant chunks.
Answer Generation
Retrieved context is passed into an LLM, and the generated answer is returned to the user.

Highlights

📄 Multi-PDF ingestion
🔍 RAG-powered contextual answers
⚡ Fast inference with Groq and fast retrieval with FAISS
📈 Observability via LangSmith
