
RagCheck is a proactive corpus quality assessment tool that analyses the document collections behind RAG applications before deployment, identifying content gaps and providing specific recommendations to improve query performance. The platform turns reactive corpus fixes into proactive quality assurance, helping organisations reach quality scores as high as 85%.


neomatrix369/AIE7-Demo-Day-Project


RagCheck | RAG

Python TypeScript Next.js FastAPI Docker

RagCheck is a 6-screen wizard application for assessing the quality of a RAG system. It runs client-provided or client-approved queries (with AI-generated and RAGAS-generated questions available when needed) against real processed documents, and combines vector similarity search, advanced gap analysis, and interactive data visualization.

Visuals

[Screenshots: RagCheck Analysis Results (Quality Score), RagCheck Gap Analysis Results, RagCheck Heatmap]

🚀 Quick Start (Docker - Recommended)

# 1. Set up environment
cp .env.example .env  # Add your OPENAI_API_KEY

# 2. Start all services with one command
./start-services.sh

Services:

Service Management

Lifecycle Management:

./start-services.sh               # Start all services (recommended)
./stop-services.sh                # Interactive stop with 4 options:
                                  #   1. Standard stop (cleans up, preserves data)
                                  #   2. Quick pause (fastest restart)
                                  #   3. Deep cleanup (reclaim disk space)
                                  #   4. Nuclear reset (⚠️ deletes all data)
./scripts/health-check.sh         # Monitor service health

Individual Services (advanced):

docker-compose up qdrant          # Just vector database
docker-compose up backend         # Backend + Qdrant (auto-starts)
docker-compose up frontend        # Full stack (auto-starts all)

Manual Setup (Fallback)

# If Docker isn't available (versions auto-switch: Python 3.12.2, Node.js v22.16.0)
./setup.sh --manual

Documentation

| Documentation | Description |
| --- | --- |
| 🏗️ Architecture | Technical architecture and system design |
| ⭐ Features | Complete feature overview and capabilities |
| 🔌 API Reference | REST endpoints, WebSocket API, and technologies |
| 🚀 Cloud Deployment | Vercel, Railway deployment guides |
| 🔧 Troubleshooting | Common issues and debugging tips |

Prerequisites

Docker Setup (Recommended)

  1. Docker & Docker Compose - For containerized environment
  2. OpenAI API Key - Document embeddings (Get API key)
  3. Data Files - CSV/PDF files in ./backend/data/

Manual Setup (Fallback)

  1. Docker & Docker Compose - Qdrant vector database only
  2. Python 3.12+ & Node.js 22+ - Auto-managed via pyenv/nvm
  3. OpenAI API Key - Document embeddings
  4. Data Files - CSV/PDF files in ./backend/data/

Key Features

  • 🎯 6-Screen Wizard: Dashboard → Questions → Experiment → Results → Gap Analysis → Heatmap
  • ⚙️ Advanced Experiments: Comprehensive experiment tracking with timing data, metadata, and chronological ordering
  • 📊 Quality Metrics Focus: Centralized quality score system replacing business impact with consistent 0-10 scale thresholds
  • 🔄 Dynamic Comparisons: Real-time comparison comments and enhanced experiment analytics
  • 📊 Advanced Gap Analysis: Domain-agnostic intelligent content gap detection with practical improvement strategies
  • 💡 Smart Recommendations: Non-ML rule-based engine with priority scoring and impact assessment
  • 🗺️ Interactive Visualizations: D3.js hexagonal heatmaps with multi-perspective analytics (Documents→Chunks, Roles→Chunks) and smart collision detection
  • 📈 Real-time Analytics: Coverage statistics, unretrieved chunk detection, performance insights
  • 🗃️ Vector Storage: Persistent Qdrant database with similarity search and real-time connectivity checks
  • 📡 Live Updates: WebSocket streaming for experiment progress with comprehensive error handling
  • ⏱️ Experiment Timing: Real-time timing display and comprehensive reproducibility metadata
  • 💬 Custom Tooltips: Consistent balloon tooltips with smart positioning and cursor indicators
  • ⚡ Performance Optimized: Advanced caching, D3.js rendering optimization, and state management
  • 🔧 Database Integration: Real-time chunk counting and connectivity status with fallback handling
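
To make the "non-ML rule-based engine with priority scoring" idea more concrete, here is a minimal sketch of how such scoring could work. It is not taken from the RagCheck codebase: the `Gap` and `Recommendation` classes, the `score_gap` formula, and the 30/10 priority cut-offs and 5.0 quality cut-off are all assumptions chosen for illustration. Only the 0-10 quality scale comes from the feature list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gap:
    """Hypothetical record of a weakly covered topic."""
    topic: str
    avg_quality: float      # mean quality score for the topic, 0-10 scale
    affected_queries: int   # number of queries that landed on this topic


@dataclass
class Recommendation:
    topic: str
    priority: float         # 0-100, higher means fix sooner
    level: str
    action: str


def score_gap(gap: Gap, total_queries: int) -> float:
    """Priority = severity (how low the score is) x reach (share of queries)."""
    if total_queries <= 0:
        return 0.0
    quality = min(max(gap.avg_quality, 0.0), 10.0)  # clamp to the 0-10 scale
    severity = (10.0 - quality) / 10.0
    reach = min(gap.affected_queries / total_queries, 1.0)
    return round(severity * reach * 100.0, 1)


def recommend(gaps: List[Gap], total_queries: int) -> List[Recommendation]:
    """Apply fixed rules to each gap and return the most urgent items first."""
    recs = []
    for gap in gaps:
        priority = score_gap(gap, total_queries)
        # Assumed cut-offs; a real engine would make these configurable.
        if priority >= 30.0:
            level = "high"
        elif priority >= 10.0:
            level = "medium"
        else:
            level = "low"
        if gap.avg_quality < 5.0:
            action = "add new source documents"
        else:
            action = "refine existing content"
        recs.append(Recommendation(gap.topic, priority, level, action))
    return sorted(recs, key=lambda r: r.priority, reverse=True)
```

Because every rule is an explicit threshold, the output is deterministic and easy to explain to a reviewer, which is the usual reason for choosing a rule-based engine over a learned model here.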

Frontend Shared Modules

  • Shared Utilities (frontend/src/utils/)
    • qualityScore.ts: Centralized quality score calculations and threshold logic
    • constants.ts: Shared constants for quality score thresholds, colors, and categorization
    • heatmapData.ts: Data processing utilities for multiple visualization perspectives
  • Heatmap Utilities (frontend/src/components/heatmap/*)
    • heatmapTheme.ts: Centralized quality score scale, colors, thresholds, legend labels
    • ScatterHeatmap.tsx: Generic hex-grid renderer using shared layout and theme utilities
    • HeatmapControls.tsx, HeatmapLegend.tsx, HeatmapTooltip.tsx: Reusable controls, legends, and smart tooltips
  • Navigation Helper (frontend/src/hooks/usePageNavigation.ts)
    • goTo(path, label?, context?), replace(path, label?, context?), back(context?)
    • Automatically logs navigation via utils/logger.ts with component, action, and context
    • Consistent navigation patterns across all pages with proper logging
  • Gap Analysis Components (frontend/src/components/gap-analysis/*)
    • GapAnalysisDashboard.tsx: Main container with comprehensive gap insights
    • GapAnalysisOverview.tsx: Interactive statistics cards with visual indicators
    • DevelopingCoverageAreas.tsx: Topic-based gap visualization with expandable details for developing coverage areas
    • RecommendationCards.tsx: Prioritized actionable recommendations with implementation tracking

Components

Backend (FastAPI)

  • Document Processing: CSV/PDF loading with LangChain chunking and configurable strategies
  • Vector Operations: Qdrant database integration with OpenAI embeddings and connectivity monitoring
  • Gap Analysis Engine: Non-ML rule-based content gap detection with sophisticated priority scoring algorithms
  • Service Architecture: Manager pattern with QualityScoreService, ExperimentService, GapAnalysisService, and ErrorResponseService
  • Performance Caching: Search result caching (5min TTL) with MD5-based query keys and LRU eviction
  • Real-time Streaming: WebSocket experiment progress updates with error handling
  • Comprehensive Logging: User-friendly logging with development/production modes
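
The search result caching described above (5-minute TTL, MD5-based query keys, LRU eviction) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's actual cache: the `SearchCache` class, its method names, the `top_k` parameter, the query normalization, and the default of 128 entries are hypothetical. Only the TTL, the MD5 keys, and the LRU policy come from the text.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class SearchCache:
    """Hypothetical TTL + LRU cache keyed by an MD5 digest of the query."""

    def __init__(self, max_entries: int = 128, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds      # 300 s = the 5-minute TTL
        self._clock = clock                 # injectable so expiry can be tested
        self._entries: OrderedDict = OrderedDict()

    @staticmethod
    def make_key(query: str, top_k: int) -> str:
        """Normalize the query, then hash it so keys have a fixed length."""
        raw = f"{query.strip().lower()}|{top_k}".encode("utf-8")
        return hashlib.md5(raw).hexdigest()

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.ttl_seconds:
            del self._entries[key]          # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

A caller would compute `key = SearchCache.make_key(query, top_k)`, try `cache.get(key)`, and only query the vector database on a miss before storing the result with `cache.put(key, results)`. MD5 is used here purely as a fast fingerprint, not for security.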

Frontend (Next.js)

  • TypeScript Application: Type-safe React components and API integration with comprehensive interfaces
  • Interactive Visualizations: D3.js hexagonal scatter plots with optimized data binding patterns
  • Performance Optimization: API request caching (10min TTL), React state management, and rendering improvements
  • UI/UX Enhancement: Custom BalloonTooltip components with smart positioning and consistent styling
  • Responsive Design: Mobile-friendly CSS Grid and Flexbox layouts with enhanced visual indicators
  • Cross-platform Storage: Adapters for local development and cloud deployment with auto-save functionality

🐛 Troubleshooting

Service Issues

# Check service health
./scripts/health-check.sh

# Stop services (interactive menu with 4 options)
./stop-services.sh
#   Option 1: Standard stop - Daily use, preserves data
#   Option 2: Quick pause - Fastest restart, no cleanup
#   Option 3: Deep cleanup - Reclaim disk space, keeps data
#   Option 4: Nuclear reset - ⚠️ DELETES ALL DATA (troubleshooting only)

# Restart services
./start-services.sh

# View container logs
docker-compose logs backend
docker-compose logs frontend
docker-compose logs qdrant

Port Conflicts

# Check what's using ports
sudo lsof -i :3000
sudo lsof -i :8000  
sudo lsof -i :6333

# Kill conflicting processes
sudo lsof -ti:3000 | xargs kill -9
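
Where `lsof` or `sudo` is not available, the same port check can be approximated from Python. The sketch below is an assumption-laden helper, not part of the repository: the function names are hypothetical, and the mapping of ports to services (3000 frontend, 8000 backend, 6333 Qdrant) is inferred from common defaults rather than stated in this README.

```python
import socket
from typing import Dict

# Assumed mapping; the README lists the ports but not which service owns each.
SERVICE_PORTS = {"frontend": 3000, "backend": 8000, "qdrant": 6333}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def check_ports(ports: Dict[str, int] = SERVICE_PORTS) -> Dict[str, bool]:
    """Map each service name to whether its port is already taken."""
    return {name: port_in_use(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, busy in check_ports().items():
        status = "in use" if busy else "free"
        print(f"{name:<10} {SERVICE_PORTS[name]:>5}  {status}")
```

Unlike `lsof`, this only reports whether a port is occupied, not which process holds it, so it is best used as a quick pre-flight check before running `./start-services.sh`.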

Environment Issues

# Verify OpenAI API key is set
grep OPENAI_API_KEY .env

# Check Docker container environment
docker-compose exec backend env | grep OPENAI

Built With

AI/ML Stack

LangChain OpenAI Qdrant

Data & Visualization

NumPy Pandas D3.js WebSockets



Contributing

Contributions are welcome! Please feel free to submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.
