A Retrieval-Augmented Generation (RAG) project using LangChain, PostgreSQL with pgVector, and OpenAI to create an intelligent question-answering system based on web documents.
Imagine you want to ask an AI assistant questions about a specific document, website, or your company's knowledge base. Language models like GPT are very intelligent, but they have a major limitation: they only know information they were trained on and don't have access to your private data or recent information.
RAG solves this problem by combining two capabilities:
- 🔍 Retrieval: Search for relevant information in your documents
- ✨ Generation: Use an LLM to formulate an answer based on that information
Concrete example:
- You ask: "What are the new product features?"
- The RAG system:
  - searches your technical documentation for relevant sections
  - provides this information to the AI as context
- The AI generates a precise answer based on YOUR data
Key benefits:
- 💰 Cost-effective: No need to retrain an expensive model
- ⚡ Fast: Instant updates with new documents
- 🎯 Accurate: Responds with your exact data, not approximations
- 🔒 Secure: Your data remains private
This project demonstrates how to build a complete RAG system that:
- Extracts content from web articles
- Splits content into manageable chunks
- Stores embeddings in a PostgreSQL vector database
- Allows asking questions about the content and getting contextual answers
Tech stack:
- Frontend/Interface: Jupyter Notebook for interactive experimentation
- LLM: OpenAI GPT-4o-mini for response generation
- Embeddings: OpenAI text-embedding-3-large for vectorization
- Vector Database: PostgreSQL with pgVector extension
- Framework: LangChain for RAG orchestration
- Containerization: Docker Compose for easy deployment
Prerequisites:
- Docker and Docker Compose
- OpenAI API Key
- Python 3.8+ (if running locally)
```bash
git clone <your-repo-url>
cd rag-langchain

# Copy the example file
cp .env.example .env

# Edit the .env file and add your OpenAI API key
# OPENAI_API_KEY=your-openai-api-key-here

# Start all services
docker-compose up -d

# Check that services are running
docker-compose ps
```

The following services are then available:
- Jupyter Lab: http://localhost:8888
- pgAdmin: http://localhost:8080 ([email protected] / admin)
- PostgreSQL: localhost:5432
- Open your browser and go to http://localhost:8888
- Open the notebook rag-lanchain.ipynb
- Execute the cells sequentially to:
  - Install dependencies
  - Configure the LLM model and embeddings
  - Create the PostgreSQL vector store
  - Load and process web content
  - Split content into chunks
  - Store embeddings in the database
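The chunking step is handled in the notebook by LangChain's `RecursiveCharacterTextSplitter`; the idea behind it can be illustrated in plain Python. This toy version splits on a fixed window with overlap, whereas the real splitter also prefers paragraph and sentence boundaries:

```python
def split_with_overlap(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive fixed-window splitter: each chunk repeats the last `overlap`
    characters of the previous one, so a sentence cut at a chunk boundary
    still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

# Roughly the size of the example article (~43k characters)
chunks = split_with_overlap("x" * 43_000)
print(len(chunks))  # → 54 windows of up to 1000 chars with 200-char overlap
```

The real splitter produces a slightly different count (~66 fragments for this article) because it breaks on separators rather than at exact character offsets.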
If you prefer to run the project locally:

```bash
# Install dependencies
pip install -r requirements.txt

# Start only PostgreSQL and pgAdmin
docker-compose up postgres pgadmin -d

# Launch Jupyter locally
jupyter lab
```

Project structure:

```
rag-langchain/
├── rag-lanchain.ipynb   # Main notebook with RAG code
├── requirements.txt     # Python dependencies
├── Dockerfile           # Jupyter container configuration
├── compose.yml          # Docker Compose configuration
├── .env.example         # Environment variables template
├── README.md            # Project documentation (English)
└── README_FR.md         # Project documentation (French)
```
Main dependencies:
- LangChain: Framework for LLM applications
- LangGraph: Graphs for complex workflows
- langchain-openai: OpenAI integration
- langchain-postgres: PostgreSQL integration
- langchain-text-splitters: Text splitting
- langchain-community: Community loaders and utilities
- PostgreSQL 15 with pgVector extension
- psycopg[binary]: Python-PostgreSQL connector
- Jupyter Lab: Interactive development environment
- Docker: Containerization
- pgAdmin: PostgreSQL administration interface
As a working example, the project indexes Lilian Weng's blog article on AI agents: https://lilianweng.github.io/posts/2023-06-23-agent/
The content is extracted, processed, and indexed to enable intelligent queries.
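During extraction, only the article body should be kept. The common LangChain pattern for this article passes a `bs4.SoupStrainer` to `WebBaseLoader` so that navigation and footers are discarded; the sketch below applies the same strainer to an inline HTML sample to avoid a network call (the class names match Lilian Weng's blog theme, and the sample HTML is made up):

```python
import bs4

# Keep only elements with these classes, as the loader configuration does
only_post = bs4.SoupStrainer(class_=("post-title", "post-header", "post-content"))

html = """
<html><body>
  <nav>Home | About</nav>
  <h1 class="post-title">LLM Powered Autonomous Agents</h1>
  <div class="post-content">Agent system overview...</div>
  <footer>© 2023</footer>
</body></html>
"""

# Elements outside the strainer (nav, footer) are never parsed
soup = bs4.BeautifulSoup(html, "html.parser", parse_only=only_post)
print(soup.get_text(" ", strip=True))
# → LLM Powered Autonomous Agents Agent system overview...
```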
- Web Extraction: Automatic web content loading with Beautiful Soup
- Text Processing: Intelligent splitting into chunks with overlap
- Vectorization: Text conversion to embeddings via OpenAI
- Vector Storage: Persistence in PostgreSQL with pgVector
- Semantic Search: Vector similarity search
- Contextual Generation: Responses based on retrieved content
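Under the hood, the semantic search step is nearest-neighbor ranking by similarity between the question's embedding and each stored chunk's embedding (pgVector performs this in SQL via its distance operators). A toy illustration with made-up 3-dimensional vectors — real embeddings from text-embedding-3-large have 3072 dimensions, and the chunk texts below are invented:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend chunk embeddings (values chosen by hand for illustration)
chunks = {
    "Agents use planning and memory.":     [0.9, 0.1, 0.0],
    "PostgreSQL stores vectors natively.": [0.1, 0.9, 0.2],
    "Docker Compose starts the services.": [0.0, 0.2, 0.9],
}
question_vec = [0.8, 0.2, 0.1]  # pretend embedding of "How do agents plan?"

# Retrieval = pick the chunk whose embedding is closest to the question's
best = max(chunks, key=lambda c: cosine_similarity(question_vec, chunks[c]))
print(best)  # → Agents use planning and memory.
```

The retrieved chunk(s) are then inserted into the prompt as context, and GPT-4o-mini generates the final answer from them.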
The following diagram illustrates the complete RAG pipeline implemented in this project:
```mermaid
graph TD
    %% Indexing Phase
    A[🌐 Web Content<br/>Lilian Weng Blog] --> B[🔍 WebBaseLoader<br/>Beautiful Soup Parsing]
    B --> C[📄 Raw Document<br/>~43k characters]
    C --> D[✂️ Text Splitter<br/>RecursiveCharacterTextSplitter<br/>chunk_size=1000, overlap=200]
    D --> E[📝 Document Chunks<br/>~66 fragments]
    E --> F[🔢 OpenAI Embeddings<br/>text-embedding-3-large]
    F --> G[🗄️ PostgreSQL + pgVector<br/>Vector Store]

    %% Query Phase
    H[❓ User Question] --> I[🔢 Question Embedding<br/>OpenAI Embeddings]
    I --> J[🔍 Similarity Search<br/>pgVector Database]
    G --> J
    J --> K[📋 Retrieved Chunks<br/>Relevant Context]
    K --> L[🤖 OpenAI GPT-4o-mini<br/>LLM Generation]
    H --> L
    L --> M[✅ Generated Answer<br/>Contextual Response]

    %% Styling
    classDef webSource fill:#e1f5fe
    classDef processing fill:#f3e5f5
    classDef storage fill:#e8f5e8
    classDef query fill:#fff3e0
    classDef output fill:#ffebee

    class A webSource
    class B,C,D,E,F processing
    class G storage
    class H,I,J,K query
    class L,M output
```
Project developed as part of learning RAG technologies with LangChain and PostgreSQL.