Releases: cmnemoi/emush_rag_chatbot_poc
v0.2.0 (2024-11-25)
Chores
- Switch to fastapi-cli for development (8a1bc38)
  - Add fastapi-cli to dev dependencies
  - Update run-chatbot command to use fastapi CLI
  - Fix test watch path to match new project structure
- Enhance Makefile and hooks for better semantic-release integration; update README for eMush RAG Chatbot repo rebranding (692ffc5)
Documentation
- Improve README and development setup (2e76ffc)
  - Add watch API terminal to VSCode configuration
  - Update README:
    - Clarify data sources (wikis, tutorials, forums)
    - Add section about improving RAG performance
    - Remove manual env file creation (now automated)
    - Update example query
  - Fix vector store import in index_documents script
Features
- Add a streamlit app (886509c)
Refactoring
- Flatten project structure (138fa9b)
  - Move core modules out of src/ directory:
    - Move chat_api.py, llm.py, rag_chain.py, and vector_store.py to root package
    - Update imports in tests and scripts
  - Update paths in Makefile for scripts
  - Improve README:
    - Fix typos and formatting
    - Add evaluation section
    - Make database and data links more readable
- Improve project structure and development setup (b8dc551)
  - Reorganize code with Protocol-based vector store abstraction
  - Add FakeVectorStore for testing purposes
  - Implement lazy initialization for RAG chain
  - Update development tooling:
    - Configure Ruff for import sorting
    - Add pytest-asyncio for better test support
    - Set up automatic env file creation
    - Improve test watching configuration
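The Protocol-based abstraction, the fake store, and the lazy initialization listed under b8dc551 can be illustrated with a minimal sketch. This is not the project's code: `Document`, the method names `add_documents` and `similarity_search`, the word-overlap ranking, and the `store_factory` parameter are all assumptions chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Protocol


@dataclass
class Document:
    """Hypothetical minimal document type, not the project's own class."""

    content: str
    metadata: dict[str, str] = field(default_factory=dict)


class VectorStore(Protocol):
    """Structural interface: any class with these two methods qualifies."""

    def add_documents(self, documents: list[Document]) -> None: ...

    def similarity_search(self, query: str, k: int = 4) -> list[Document]: ...


class FakeVectorStore:
    """In-memory stand-in for tests: ranks documents by shared words."""

    def __init__(self) -> None:
        self._documents: list[Document] = []

    def add_documents(self, documents: list[Document]) -> None:
        self._documents.extend(documents)

    def similarity_search(self, query: str, k: int = 4) -> list[Document]:
        query_words = set(query.lower().split())

        def overlap(document: Document) -> int:
            return len(query_words & set(document.content.lower().split()))

        # sorted() is stable, so ties keep insertion order.
        return sorted(self._documents, key=overlap, reverse=True)[:k]


class RAGChain:
    """Builds its vector store on first use instead of at construction time."""

    def __init__(self, store_factory: Callable[[], VectorStore]) -> None:
        self._store_factory = store_factory
        self._store: VectorStore | None = None

    @property
    def store(self) -> VectorStore:
        if self._store is None:  # lazy initialization happens here
            self._store = self._store_factory()
        return self._store

    def retrieve(self, question: str, top_k: int = 2) -> list[Document]:
        return self.store.similarity_search(question, k=top_k)
```

Because `VectorStore` is a `Protocol`, `FakeVectorStore` does not need to inherit from it; a test can pass `RAGChain(FakeVectorStore)` and never touch a real database or embedding API.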
v0.1.0 (2024-11-24)
Bug Fixes
- Make document indexing asynchronous with await for vector store (75ac781)
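The release note does not say what the original bug was. One common cause of this kind of fix is calling an async store method without `await`, which creates a coroutine object but never runs it. The sketch below shows the awaited form under that assumption; `InMemoryAsyncStore` and `add_texts` are hypothetical names, not the project's API.

```python
from __future__ import annotations

import asyncio


class InMemoryAsyncStore:
    """Hypothetical async store standing in for a real vector store client."""

    def __init__(self) -> None:
        self.texts: list[str] = []

    async def add_texts(self, texts: list[str]) -> None:
        await asyncio.sleep(0)  # placeholder for an I/O-bound call
        self.texts.extend(texts)


async def index_documents(store: InMemoryAsyncStore, texts: list[str]) -> int:
    # Without `await`, add_texts() would only build a coroutine object
    # and nothing would be written to the store.
    await store.add_texts(texts)
    return len(store.texts)


store = InMemoryAsyncStore()
count = asyncio.run(index_documents(store, ["doc one", "doc two"]))
```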
Documentation
- Update README with correct command to run the application (0c89bf9)
- Update README to use uv for running Python scripts (e4a1c3c)
- Update README with detailed project description and usage instructions (874be74)
Features
- Add initial dummy test in test_main.py for basic validation of test structure (1ba149a)
- Refactor chat API to remove unnecessary filter_metadata; update vector store and dependencies for better performance and stability (7bf039f)
- Update CHAT_MODEL to "gpt-4o-mini" and switch evaluation dataset to "test_set_v3.csv" for improved chatbot performance (6e536bc)
- Update chat model to "gpt-4o" and amend evaluation results for consistency in eMush chatbot configuration and responses (20f3645)
- Add prompt template for version V6 to enhance context handling and reasoning in eMush chatbot responses (50e36e8)
- Remove redundant import of app in main.py for cleaner entry point structure (659ecde)
- Update prompt templates in RAGChain for versions V4 and V5 with enhanced reasoning requirements (a0b68f1)
- Enhance RAGChain to utilize versioned prompt templates for improved context handling (9aa4b98)
- Allow configuring top_k for document retrieval in RAGChain (6d9d827)
- Save evaluation results to JSON with metadata and nested structure (1a81f37)
- Add question reformulation step to RAG chain for improved context retrieval (990008d)
- Enhance evaluation results storage with UUID, timestamp, and CSV export (068135d)
- Add new Makefile targets for RAG evaluation, document indexing, and chatbot execution scripts (e1a761d)
- Enhance RAG evaluation script with improved response evaluation and add test data for comprehensive assessments (905beab)
- Update DocumentLoader to allow larger chunk sizes and improved document splitting logic for long texts (8239850)
- Add RAG evaluation script with LLM-as-a-judge functionality (f471855)
- Modify chat API to return source documents with response (ab52c94)
- Add batch processing and tqdm progress bar for document indexing (de3a224)
- Restore batch processing for document indexing (5cccc32)
- Add RecursiveCharacterTextSplitter to handle long document chunks (01261d9)
- Update document loader to show loading progress and change batch size from 10 to 8 (d272ad1)
- Return response with source documents in RAG chain (de0351a)
- Add document batching to avoid OpenAI API rate limits (56bf522)
- Add title to document metadata in document loader (ad07daa)
- Add script to index documents from Mushpedia JSON into vector store (78dd265)
- Implement RAG chatbot for eMush game with vector store and FastAPI endpoint (102a65a)
  This commit adds the core implementation of the RAG-powered chatbot for the eMush game, including:
  - Vector store management with Chroma
  - RAG chain for context-aware responses
  - FastAPI endpoint for chat interactions
  - Logging and error handling
  - Modular architecture with separate components for vector store, RAG chain, and API
  The implementation supports:
  - Semantic document search
  - Chat history context
  - Metadata filtering
  - Async processing
  - Configurable LLM and embedding models
- Implement vector store for document indexing and similarity search using Chroma and OpenAI embeddings (388230a)
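Several entries above mention splitting documents into batches before indexing so that a single request does not exceed API rate limits. A minimal sketch of that idea follows. The helper names `batched` and `index_in_batches` and the `add_batch` callback are hypothetical; only the batch size of 8 comes from the notes (d272ad1).

```python
from __future__ import annotations

from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def index_in_batches(
    documents: Sequence[T],
    add_batch: Callable[[list[T]], None],
    batch_size: int = 8,
) -> int:
    """Send documents to `add_batch` one batch at a time; return the batch count."""
    batch_count = 0
    for batch in batched(documents, batch_size):
        add_batch(list(batch))  # one request per batch instead of one huge call
        batch_count += 1
    return batch_count
```

In a real indexing script, `add_batch` would wrap the vector store's insert call, and a pause or retry between batches could be added if the provider still throttles requests.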
Refactoring
- Modify evaluation script to append results to existing CSV file (b4e8747)
- Remove reformulation step from RAG chain (2f98805)
- Enhance question reformulation strategy for RAG retrieval (c6692ac)
- Remove batch processing from document loading and indexing (a76d130)
- Remove batch processing in index_documents.py (4c7e51a)
- Modify document loading to remove batch return parameter (25e587f)
- Remove batch size logic from document loader (503d418)
- Adjust text splitter chunk overlap and remove custom separators (5d22ca9)
- Simplify RAG response to return only text response (51a1845)
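Appending evaluation results to an existing CSV file (b4e8747), each run tagged with a UUID and timestamp (068135d), could look roughly like the sketch below. The column names and the `append_results` function are assumptions for illustration; the project's actual schema is not given in these notes.

```python
from __future__ import annotations

import csv
import uuid
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical columns; the real evaluation schema may differ.
FIELDNAMES = ["run_id", "timestamp", "question", "score"]


def append_results(csv_path: Path, rows: list[dict]) -> str:
    """Append one evaluation run to `csv_path` and return its run ID."""
    run_id = str(uuid.uuid4())
    timestamp = datetime.now(timezone.utc).isoformat()
    # Write the header only when the file is new or empty.
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        for row in rows:
            writer.writerow({"run_id": run_id, "timestamp": timestamp, **row})
    return run_id
```

Opening the file in append mode keeps earlier runs intact, and the shared `run_id` makes it possible to group the rows of one run when comparing prompt versions or models.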