Haystack RAG pipeline for a chatbot.
Install poetry.
pip install poetry
Install required packages using the following command:
poetry install
We're building a Haystack RAG pipeline for a chatbot that answers questions about music lyrics.
For this project, we're using a subset of this Kaggle Dataset: Song Lyrics.
We use Haystack for the document store and the RAG pipeline, OpenAI's GPT-3.5 Turbo as the LLM, and Chainlit to build the song lyrics chatbot.
Check our video presentation: Haystack RAG pipeline for a music lyrics chatbot - Hacktoberfest 2023 by Ploomber
- Setting up the Document Store
  - Initializing an InMemoryDocumentStore with BM25 retrieval capabilities.
- Data Retrieval
  - Downloading the lyrics dataset from Kaggle.
  - Loading lyrics data for different artists into dataframes.
  - Merging the dataframes into a single dataframe.
  - Data preprocessing, including column renaming and conversion to the document store format.
- Prompt Template
  - Defining a rag_prompt template for generating responses from music lyrics and user questions.
- Retriever Configuration
  - Configuring a BM25Retriever to work with the document store for document retrieval based on user queries.
- GPT-3.5 Turbo Configuration
  - Setting up a PromptNode to use the GPT-3.5 Turbo model for generating responses. This includes specifying your OpenAI API key and using the rag_prompt template.
- Pipeline Setup
  - Creating a pipeline (pipe) with two nodes: the retriever and the GPT-3.5 Turbo model.
- Main Function
  - Defining the core functionality of the project, where user queries are processed using the pipeline, and responses are sent back to the user.
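
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the Haystack 1.x API (the farm-haystack package), and the column names (lyric, artist, title), the prompt wording, top_k=3, and the helper names rows_to_documents, build_pipeline, and ask are all hypothetical.

```python
def rows_to_documents(rows):
    """Convert row dicts into the {"content", "meta"} dicts Haystack 1.x accepts.

    The column names "lyric", "artist", and "title" are assumptions.
    """
    return [
        {
            "content": row["lyric"],
            "meta": {"artist": row["artist"], "title": row["title"]},
        }
        for row in rows
        if row.get("lyric")  # skip rows with missing or empty lyrics
    ]


# Hypothetical prompt wording. Haystack fills in {join(documents)} with the
# retrieved lyrics and {query} with the user's question.
RAG_PROMPT_TEXT = (
    "Using only the song lyrics below, answer the question.\n"
    "Lyrics: {join(documents)}\n"
    "Question: {query}\n"
    "Answer:"
)


def build_pipeline(documents, api_key):
    """Build the retriever + GPT-3.5 Turbo pipeline (assumes Haystack 1.x)."""
    # Imported lazily so the helpers above work without Haystack installed.
    from haystack import Pipeline
    from haystack.document_stores import InMemoryDocumentStore
    from haystack.nodes import AnswerParser, BM25Retriever, PromptNode, PromptTemplate

    # In-memory store with BM25 indexing enabled.
    document_store = InMemoryDocumentStore(use_bm25=True)
    document_store.write_documents(documents)

    # Keyword-based retriever; top_k=3 is an arbitrary choice for this sketch.
    retriever = BM25Retriever(document_store=document_store, top_k=3)

    rag_prompt = PromptTemplate(prompt=RAG_PROMPT_TEXT, output_parser=AnswerParser())
    prompt_node = PromptNode(
        model_name_or_path="gpt-3.5-turbo",
        api_key=api_key,
        default_prompt_template=rag_prompt,
    )

    # Two nodes: the retriever feeds its documents into the prompt node.
    pipe = Pipeline()
    pipe.add_node(component=retriever, name="retriever", inputs=["Query"])
    pipe.add_node(component=prompt_node, name="prompt_node", inputs=["retriever"])
    return pipe


def ask(pipe, question):
    """Run a user question through the pipeline and return the answer text."""
    result = pipe.run(query=question)
    return result["answers"][0].answer
```

With a valid OpenAI API key, something like ask(build_pipeline(rows_to_documents(rows), api_key), "Which song mentions the rain?") would return the generated answer as a string. In the chatbot, a call along these lines would sit inside the Chainlit message handler.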
Welcome message from the chatbot.
We use Chainlit to make the chatbot.
- LinkedIn: Monica Regina da Silva, Carlos Bustillo
- GitHub: MonicaRSilva, cabustillo13