A Docker-based RAG (Retrieval Augmented Generation) system that provides intelligent querying of DSPy documentation using Ollama, Qdrant, and LlamaIndex.
- **Optimized for Apple MacBook Air M3 (Apple Silicon, 8 GB unified memory)**
- RAG Implementation: Uses LlamaIndex for document processing and retrieval
- Vector Storage: Qdrant for efficient vector storage and similarity search
- Local LLM: Ollama integration with orca-mini model
- Optimizations:
- Leverages Metal Performance Shaders (MPS) on Apple Silicon (see the device-selection sketch after this list)
- Docker Abstraction
- Robust Error Handling
- Lightweight Inference Models
- Streamlit Query Implementation
- Sentence-level chunking with overlap
- Cross-encoder reranking
- Custom prompt templates
- Efficient vector similarity search
- Asynchronous query processing
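A minimal sketch of the MPS optimization, assuming PyTorch and the modern `llama_index` package layout: the embedding model targets `mps` when available and falls back to CPU.

```python
import torch
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Prefer Metal Performance Shaders (MPS) on Apple Silicon; fall back to CPU.
device = "mps" if torch.backends.mps.is_available() else "cpu"

# The embedding model runs on the selected device.
embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2",
    device=device,
)
```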
- macOS with Apple Silicon (M1/M2/M3)
- Docker Desktop
- Ollama Desktop
- Streamlit
```
┌─────────────────┐      ┌──────────────┐      ┌──────────────┐
│     RAG App     │      │    Qdrant    │      │    Ollama    │
│ - LlamaIndex    │─────▶│  Vector DB   │      │  Local LLM   │
│ - HF Embeddings │◀─────│              │      │              │
└─────────────────┘      └──────────────┘      └──────────────┘
```
- Clone the repository:
```bash
git clone <repository-url>
cd <repository-name>
```
- Start the Ollama Desktop application
- Pull the required model:
  - This should happen automatically via the Dockerfile
- Build and run with Docker Compose:
```bash
docker compose up --build
```
- Persistent storage for document embeddings
- Efficient similarity search
- Scalable vector database (see the connection sketch below)
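A sketch of how the app might connect to the Qdrant container, assuming the default HTTP port 6333 and an illustrative collection name `dspy_docs`:

```python
from qdrant_client import QdrantClient
from llama_index.core import StorageContext
from llama_index.vector_stores.qdrant import QdrantVectorStore

# Connect to the Qdrant container (default HTTP port 6333).
client = QdrantClient(host="localhost", port=6333)

# Wrap the collection as a LlamaIndex vector store; "dspy_docs" is illustrative.
vector_store = QdrantVectorStore(client=client, collection_name="dspy_docs")
storage_context = StorageContext.from_defaults(vector_store=vector_store)
```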
- Sentence-level chunking (1024 tokens with 200-token overlap)
- HuggingFace embeddings (all-MiniLM-L6-v2)
- Filename-based document tracking (see the ingestion sketch below)
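A sketch of the ingestion step under these settings: the splitter values match the parameters above, and `filename_as_id=True` keys each document by its filename for tracking.

```python
from llama_index.core import Settings, SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Chunking and embedding configuration matching the parameters above.
Settings.embed_model = HuggingFaceEmbedding(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)
Settings.node_parser = SentenceSplitter(chunk_size=1024, chunk_overlap=200)

# filename_as_id=True enables filename-based document tracking.
documents = SimpleDirectoryReader("docs", filename_as_id=True).load_data()
nodes = Settings.node_parser.get_nodes_from_documents(documents)
```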
- Cross-encoder reranking (ms-marco-MiniLM-L-2-v2)
- Custom prompt templates for consistent responses
- Top-k retrieval with reranking (see the sketch after this list)
- Async query processing
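A sketch of the query-engine wiring. The retrieval cutoffs (`similarity_top_k=10`, `top_n=3`) and the prompt wording are illustrative; the repo's actual values and template may differ.

```python
from llama_index.core import PromptTemplate, SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.postprocessor import SentenceTransformerRerank

# Cross-encoder reranker named above; keep the 3 best chunks after reranking.
reranker = SentenceTransformerRerank(
    model="cross-encoder/ms-marco-MiniLM-L-2-v2",
    top_n=3,
)

# Hypothetical prompt template; the repo's actual wording may differ.
qa_prompt = PromptTemplate(
    "Context information is below.\n"
    "---------------------\n"
    "{context_str}\n"
    "---------------------\n"
    "Answer the query using only the context above.\n"
    "Query: {query_str}\n"
    "Answer: "
)

index = VectorStoreIndex.from_documents(SimpleDirectoryReader("docs").load_data())
query_engine = index.as_query_engine(
    similarity_top_k=10,             # retrieve broadly first...
    node_postprocessors=[reranker],  # ...then rerank down to top_n
    text_qa_template=qa_prompt,
)
response = query_engine.query("What is a DSPy signature?")
```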
- Local inference with orca-mini
- Optimized for Apple Silicon
- Configurable timeout and retry logic (see the sketch below)
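A sketch of the Ollama configuration with a request timeout plus a simple retry wrapper; `complete_with_retry` is illustrative, not part of the repo.

```python
import time

from llama_index.llms.ollama import Ollama

# Local Ollama endpoint as seen from inside a Docker container.
llm = Ollama(
    model="orca-mini",
    base_url="http://host.docker.internal:11434",
    request_timeout=120.0,  # seconds
)

def complete_with_retry(prompt: str, retries: int = 3, backoff: float = 2.0):
    """Illustrative retry wrapper: linear backoff between failed attempts."""
    for attempt in range(retries):
        try:
            return llm.complete(prompt)
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))
```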
- Place PDF documentation in the `docs` directory
- Start the system:
```bash
docker compose up
```
- Interact with the system through the Streamlit UI (see the sketch below)
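A minimal sketch of what the Streamlit front end might look like; `rag_setup` is a hypothetical module standing in for the query-engine construction shown earlier, and the repo's actual UI may differ.

```python
import streamlit as st

from rag_setup import query_engine  # hypothetical; see the query-engine sketch

st.title("DSPy Documentation Q&A")

question = st.text_input("Ask a question about the DSPy docs:")
if question:
    with st.spinner("Retrieving context and generating an answer..."):
        response = query_engine.query(question)
    st.write(str(response))
```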
- Built with Python 3.9
- Uses LlamaIndex for document processing
- Async support for concurrent operations
- Comprehensive logging and error handling (see the async sketch below)
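A sketch of async querying with logging and error handling, again reusing the hypothetical `rag_setup` module; `aquery()` is LlamaIndex's async counterpart to `query()`.

```python
import asyncio
import logging

from rag_setup import query_engine  # hypothetical; see the query-engine sketch

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

async def ask(question: str):
    """Run one query asynchronously, logging success and failure."""
    try:
        response = await query_engine.aquery(question)  # async query entry point
        logger.info("Answered: %s", question)
        return response
    except Exception:
        logger.exception("Query failed: %s", question)
        raise

if __name__ == "__main__":
    asyncio.run(ask("How do DSPy optimizers work?"))
```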
- Qdrant for vector similarity search
- Persistent storage across sessions
- Efficient index management
- Ollama for local inference
- Optimized for Apple Silicon
- Configurable model parameters
- Ollama Connection Issues
  - Ensure Ollama Desktop is running
  - Check the model is pulled: `ollama list`
  - Verify `host.docker.internal` is accessible (see the check below)
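A quick connectivity check from inside the container; Ollama's `/api/tags` endpoint lists the locally available models.

```python
import requests

# List the models the Ollama server exposes.
resp = requests.get("http://host.docker.internal:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", models)  # "orca-mini" should appear here
```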
- Memory Issues
  - Adjust Docker resource limits
  - Reduce the chunk size and overlap
  - Consider using a smaller model
- Performance Issues
  - Enable MPS acceleration
  - Adjust batch sizes
  - Monitor resource usage
Akshay Pachaar, Avi Chawla - A Crash Course on Building RAG Systems – Part 1 (With Implementation)