diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/1_setting-up-jac-cloud.md b/jac/support/jac-lang.org/docs/learn/tutorial/1_setting-up-jac-cloud.md
new file mode 100644
index 0000000000..5f91a9bbdd
--- /dev/null
+++ b/jac/support/jac-lang.org/docs/learn/tutorial/1_setting-up-jac-cloud.md
@@ -0,0 +1,153 @@
+# Setting Up Your Jac Cloud Application
+Jac Cloud is a jaclang plugin that bootstraps your jac application into a running web server. It allows you to serve your jac code as a REST API and interact with it from any client that can make HTTP requests. To set up Jac Cloud, install the `jac-cloud` python package using pip:
+
+```bash
+pip install jac-cloud
+```
+
+**Note**: jac-cloud requires jaclang version 0.7.18 or later. Make sure you have the latest version of jaclang installed.
+
+## Setting Up Your Database
+Like most API servers, jac-cloud requires a database to store data persistently. Jac-cloud uses MongoDB as its database engine, so you will need a running MongoDB service. You can set one up in several ways:
+
+- Set up MongoDB manually on your local machine
+- Use a container service like Docker
+- Use a free cloud service like [MongoDB Atlas](https://www.mongodb.com/products/platform/atlas-database).
+
+In this tutorial we will show you how to do this manually and also using Docker (recommended).
+
+### Running a MongoDB Replica Set Manually
+To set up a MongoDB replica set manually, follow these steps:
+
+- MongoDB Community Edition is free to use and run locally. Follow the installation and startup instructions in the MongoDB documentation [here](https://www.mongodb.com/docs/manual/installation/), selecting the right guide for your OS.
+- After installing, start the MongoDB service by running the following command:
+    ```bash
+    mongod --dbpath DB_DATA_PATH --replSet my-rs
+    ```
+    Replace `DB_DATA_PATH` with the path to your database data directory. It can be any directory; it will be used to store the files for the database. This starts the MongoDB service with a replica set named `my-rs`.
+- **First time only**: The first time you start MongoDB, do the following two quick steps:
+    - Run the command `mongosh` in another terminal to open the mongo shell.
+    - In the mongo shell, run the following command to initiate the replica set:
+    ```bash
+    rs.initiate()
+    ```
+    This command initiates the replica set with the default configuration. You can customize the configuration as needed.
+    - Run `exit` to exit the mongo shell.
+
+### Running a MongoDB Replica Set Using Docker
+To set up a MongoDB replica set using Docker, follow these steps:
+
+- Ensure you have Docker installed on your machine. You can download and install Docker from the official website [here](https://www.docker.com/products/docker-desktop).
+- Pull the MongoDB image from Docker Hub:
+```bash
+docker pull mongodb/mongodb-community-server:latest
+```
+- Run the image as a container:
+```bash
+docker run --name mongodb -p 27017:27017 -d mongodb/mongodb-community-server:latest --replSet my-rs
+```
+This starts a MongoDB container named `mongodb` with a replica set named `my-rs`, exposing port `27017` to the host machine.
+- Install the mongo shell to connect to the MongoDB container. You can install the mongo shell by following the instructions [here](https://www.mongodb.com/docs/mongodb-shell/install/).
+- Connect to the MongoDB deployment with mongosh:
+```bash
+mongosh --port 27017
+```
+- **First time only**: The first time you start the container, initiate the replica set:
+    - In the mongo shell you just opened, run the following command:
+    ```bash
+    rs.initiate({_id: "my-rs", members: [{_id: 0, host: "localhost"}]})
+    ```
+    This command initiates the replica set `my-rs` with a single member. You can customize the configuration as needed.
+    - Run `exit` to exit the mongo shell.
+
+Additional setup instructions can be found [here](https://docs.mongodb.com/manual/tutorial/deploy-replica-set/).
+
+## Installing the Jac VSCode Extension
+To make your development experience easier, you should install the Jac extension for Visual Studio Code. This extension provides syntax highlighting, code snippets, and other features to help you write Jac Cloud code more efficiently. You can install the extension from the Visual Studio Code marketplace [here](https://marketplace.visualstudio.com/items?itemName=jaseci-labs.jaclang-extension).
+
+![Jac Extension](images/1_vscode.png)
+
+## Your First Jac Cloud Application
+Now that you have your database set up, you can start building your first Jac Cloud application. Create a new file called `app.jac` and add the following code:
+
+```jac
+walker interact {
+    can return_message with `root entry {
+        report {
+            "message": "Hello, world!"
+        };
+    }
+}
+
+walker interact_with_body {
+    has name: str;
+
+    can return_message with `root entry {
+        report {
+            "message": "Hello, " + self.name + "!"
+        };
+    }
+}
+```
+
+This code defines two walkers, `interact` and `interact_with_body`. The `interact` walker returns a simple message "Hello, world!" when called. The `interact_with_body` walker takes a `name` parameter and returns the message "Hello, `name`!". No need to worry about what a walker is for now; think of it as a function that can be called as an API endpoint.
+
+Now, let's serve this code using Jac Cloud by running the following command:
+
+```bash
+DATABASE_HOST=mongodb://localhost:27017/?replicaSet=my-rs jac serve app.jac
+```
+
+This command starts the Jac Cloud server with the database host set to `mongodb://localhost:27017/?replicaSet=my-rs` and serves the code in `app.jac` as an API. You can now access the API at `http://localhost:8000`. Go to `http://localhost:8000/docs` to see the Swagger documentation for the API. It should look something like this:
+
+![Swagger Docs](images/1_swagger.png)
+
+Before we can fully test the API, it is important to know that, by default, Jac Cloud requires authentication to access the API. So we need to create a user and get an access token. You can do this using the Swagger UI or by making HTTP requests; we will show you how to do it with HTTP requests.
+
+First, register a user by running the following command:
+
+```bash
+curl --location 'http://localhost:8000/user/register' \
+--header 'Content-Type: application/json' \
+--header 'Accept: application/json' \
+--data '{
+    "password": "password",
+    "email": "test@mail.com"
+}'
+```
+
+This command creates a user with the email `test@mail.com` and password `password`.
+Next, we'll need to log in and get an access token. To do this, run the following command:
+
+```bash
+curl --location 'http://localhost:8000/user/login' \
+--header 'Content-Type: application/json' \
+--header 'Accept: application/json' \
+--data '{
+    "password": "password",
+    "email": "test@mail.com"
+}'
+```
+
+You should see a response similar to this:
+
+```json
+{"token":"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjY2ZGYzN2Y0MjIzNDM2N2QxZDMzMDE1MSIsImVtYWlsIjoidGVzdEBtYWlsLmNvbSIsInJvb3RfaWQiOiI2NmRmMzdmNDIyMzQzNjdkMWQzMzAxNTAiLCJpc19hY3RpdmF0ZWQiOnRydWUsImV4cGlyYXRpb24iOjE3MjYwMzAyNDUsInN0YXRlIjoiZGlCQnJOMHMifQ.oFQ5DuUBwzGVedmk4ktesFIelZR0JH8xx7zU4L_Vu3k","user":{"id":"66df37f42234367d1d330151","email":"test@mail.com","root_id":"66df37f42234367d1d330150","is_activated":true,"expiration":1726030245,"state":"diBBrN0s"}}
+```
+
+Copy the access token from the response and use it to access the API. You can now test the `interact` walker using the following command:
+
+```bash
+curl -X POST http://localhost:8000/walker/interact -H "Authorization: Bearer <ACCESS_TOKEN>"
+```
+
+Replace `<ACCESS_TOKEN>` with the access token you received. This command returns the walker's report with the message "Hello, world!":
+
+```json
+{"status":200,"reports":[{"message":"Hello, world!"}]}
+```
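+
+The `interact_with_body` walker can be called the same way; with jac-cloud, a walker's `has` fields are filled from the JSON body of the request. A quick sketch (the `name` field comes from the walker definition above, and the example value is our own):
+
+```bash
+curl -X POST http://localhost:8000/walker/interact_with_body \
+-H "Authorization: Bearer <ACCESS_TOKEN>" \
+-H "Content-Type: application/json" \
+-d '{"name": "Jac"}'
+```
+
+If everything is wired up correctly, this should report `{"message": "Hello, Jac!"}`.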
+
+You can also do this in the browser by visiting the Swagger docs at `http://localhost:8000/docs` and adding the `Authorization` header with the value `Bearer <ACCESS_TOKEN>`.
+
+That's it! You have successfully set up your Jac Cloud application and served your first API. In the [next](2_building-a-rag-chatbot.md) part we will learn how to build a simple conversational agent using Jac Cloud. Stay tuned!
\ No newline at end of file
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/2_building-a-rag-chatbot.md b/jac/support/jac-lang.org/docs/learn/tutorial/2_building-a-rag-chatbot.md
new file mode 100644
index 0000000000..db89e8e38b
--- /dev/null
+++ b/jac/support/jac-lang.org/docs/learn/tutorial/2_building-a-rag-chatbot.md
@@ -0,0 +1,472 @@
+# Building a RAG Chatbot with Jac Cloud and Streamlit
+Now that we have a jac application served up, let's build a simple chatbot using Retrieval Augmented Generation (RAG) with Jac Cloud, using Streamlit as our frontend interface.
+
+### Preparation / Installation
+Make sure you have all the dependencies installed in your environment. If you don't, you can install them using pip:
+```bash
+pip install jaclang mtllm[openai] jaclang_streamlit jac-cloud requests langchain_community chromadb langchain pypdf
+```
+
+## Building a Streamlit Interface
+Before we begin building out our chatbot, let's first build a simple GUI to interact with it. Streamlit offers several chat elements for building graphical user interfaces (GUIs) for conversational agents or chatbots. Leveraging session state along with these elements allows you to construct anything from a basic chatbot to a more advanced, ChatGPT-like experience using pure Python code.
+
+Luckily for us, jaclang has a plugin for Streamlit that allows us to build Streamlit web applications in jaclang. In this part of the tutorial, we will build a frontend for our conversational agent using Streamlit. You can find more information about the `jaclang_streamlit` plugin [here](https://github.com/Jaseci-Labs/jaclang/blob/main/support/plugins/streamlit/README.md).
+
+First, let's create a new file called `client.jac` in the root directory of your project. This file will contain the code for the frontend chat interface.
+
+We start by importing the necessary modules in Jac:
+
+- `streamlit` (for frontend UI components)
+- `requests` (for making API calls)
+
+```jac
+import:py streamlit as st;
+import:py requests;
+```
+
+- `streamlit` will handle the user interface (UI) of the chatbot.
+- `requests` will handle API calls to our backend.
+
+Now let's define a function `bootstrap_frontend`, which accepts a token for authentication and builds the chat interface.
+
+```jac
+can bootstrap_frontend (token: str) {
+    st.write("Welcome to your Demo Agent!");
+
+    # Initialize chat history
+    if "messages" not in st.session_state {
+        st.session_state.messages = [];
+    }
+```
+
+- `st.write()` adds a welcome message to the app.
+- `st.session_state` is used to persist data across user interactions. Here, we're using it to store the chat history (`messages`).
+
+Now, let's update the function so that when the page reloads or updates, the previous chat messages are reloaded from `st.session_state.messages`.
+
+```jac
+    for message in st.session_state.messages {
+        with st.chat_message(message["role"]) {
+            st.markdown(message["content"]);
+        }
+    }
+```
+
+- This block loops through the stored messages in the session state.
+- For each message, we use `st.chat_message()` to display the message by its role (either `"user"` or `"assistant"`).
+
+Next, let's capture user input using `st.chat_input()`. This is where users can type their message to the chatbot.
+
+```jac
+    if prompt := st.chat_input("What is up?") {
+        # Add user message to chat history
+        st.session_state.messages.append({"role": "user", "content": prompt});
+
+        # Display user message in chat message container
+        with st.chat_message("user") {
+            st.markdown(prompt);
+        }
+```
+
+- `st.chat_input()` waits for the user to type a message and submit it.
+- Once the user submits a message, it's appended to the session state's message history and immediately displayed on the screen.
+
+Now we handle the backend interaction. After the user submits a message, the assistant responds. This involves sending the user's message to the backend and displaying the response.
+
+```jac
+        # Display assistant response in chat message container
+        with st.chat_message("assistant") {
+
+            # Call walker API
+            response = requests.post("http://localhost:8000/walker/interact", json={"message": prompt, "session_id": "123"},
+                headers={"Authorization": f"Bearer {token}"}
+            );
+
+            if response.status_code == 200 {
+                response = response.json();
+                print(response);
+                st.write(response["reports"][0]["response"]);
+
+                # Add assistant response to chat history
+                st.session_state.messages.append({"role": "assistant", "content": response["reports"][0]["response"]});
+            }
+        }
+    }
+}
+```
+
+- The user's input (`prompt`) is sent to the backend using a POST request to the `/walker/interact` endpoint.
+- The response from the backend is then displayed using `st.write()`, and the assistant's message is stored in the session state.
+- The `session_id` is hardcoded here for simplicity, but you can make it dynamic based on your application's needs, as shown in the sketch below.
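+
+For instance, here is a minimal sketch of a dynamic session id. It assumes you add `import:py uuid;` to the top of `client.jac`; the helper name `get_session_id` is our own:
+
+```jac
+can get_session_id {
+    # Create one id per browser session instead of hardcoding "123"
+    if "session_id" not in st.session_state {
+        st.session_state.session_id = str(uuid.uuid4());
+    }
+    return st.session_state.session_id;
+}
+```
+
+You would then pass `get_session_id()` as the `session_id` field in the POST body above.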
+
+Lastly, we'll define the entry point, which authenticates the user and retrieves the token needed for the `bootstrap_frontend` function.
+
+```jac
+with entry {
+
+    INSTANCE_URL = "http://localhost:8000";
+    TEST_USER_EMAIL = "test@mail.com";
+    TEST_USER_PASSWORD = "password";
+
+    response = requests.post(
+        f"{INSTANCE_URL}/user/login",
+        json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
+    );
+
+    if response.status_code != 200 {
+        # Try registering the user if login fails
+        response = requests.post(
+            f"{INSTANCE_URL}/user/register",
+            json={
+                "email": TEST_USER_EMAIL,
+                "password": TEST_USER_PASSWORD
+            }
+        );
+        assert response.status_code == 201;
+
+        response = requests.post(
+            f"{INSTANCE_URL}/user/login",
+            json={"email": TEST_USER_EMAIL, "password": TEST_USER_PASSWORD}
+        );
+        assert response.status_code == 200;
+    }
+
+    token = response.json()["token"];
+
+    print("Token:", token);
+
+    bootstrap_frontend(token);
+}
+```
+
+In the entry block:
+- First, we define the backend URL and test user credentials.
+- We attempt to log the user in. If login fails, we register the user and then log them in.
+- Once logged in, the token is extracted and printed.
+- Finally, `bootstrap_frontend(token)` is called with the obtained token.
+
+Now you can run the frontend using the following command:
+
+```bash
+jac streamlit client.jac
+```
+
+If your server is still running, you can chat with your assistant through the Streamlit interface. The response will only ever be "Hello, world!" for now, but we will update the backend to use the RAG module shortly.
+
+Now let's move on to building the RAG module.
+
+## What is Retrieval Augmented Generation?
+Retrieval Augmented Generation (RAG) is a technique that combines the benefits of retrieval-based and generative conversational AI models. A retrieval-based model retrieves semantically similar content based on the input, while a generative model generates a response from scratch based on the input. RAG combines the two: it first retrieves a set of candidate responses or relevant content, then generates a response based on the retrieved candidates.
+
+![Retrieval Augmented Generation](images/2_rag.png)
+
+## Building a Retrieval Augmented Generation Module
+In this part we'll build a simple Retrieval Augmented Generation module in jaclang and add it to our application. We will use a simple embedding-based retrieval model to retrieve relevant content and a generative model to produce the final response. Embeddings are vector representations of words or sentences that capture semantic information. We will use the Ollama embeddings model to generate embeddings for the documents and the Chroma vector store to store them.
+
+![RAG Module](images/2_rag_search.png)
+
+### Adding the Retrieval Module
+First, let's add a file called `rag.jac` to our project. This file will contain the code for the Retrieval Augmented Generation module.
+
+Jac allows you to import Python libraries, making it easy to integrate existing libraries such as langchain, langchain_community, and more. In this RAG engine, we need document loaders, text splitters, embedding functions, and vector stores.
+
+```jac
+import:py from langchain_community.document_loaders {PyPDFDirectoryLoader}
+import:py from langchain_text_splitters {RecursiveCharacterTextSplitter}
+import:py from langchain.schema.document {Document}
+import:py from langchain_community.embeddings.ollama {OllamaEmbeddings}
+import:py from langchain_community.vectorstores.chroma {Chroma}
+import:py os;
+```
+
+- `PyPDFDirectoryLoader` is used to load documents from a directory.
+- `RecursiveCharacterTextSplitter` is used to split the documents into chunks.
+- `OllamaEmbeddings` is used to generate embeddings from document chunks.
+- `Chroma` is our vector store for storing the embeddings.
+
+Now let's define the `rag_engine` object that will handle the retrieval of relevant content. The object has two properties: `file_path` for the location of the documents and `chroma_path` for the location of the vector store.
+
+```jac
+obj rag_engine {
+    has file_path: str = "docs";
+    has chroma_path: str = "chroma";
+```
+
+The object has a `postinit` method that runs automatically upon initialization, loading the documents, splitting them into chunks, and adding the chunks to the vector database (Chroma).
+
+```jac
+    can postinit {
+        documents: list = self.load_documents();
+        chunks: list = self.split_documents(documents);
+        self.add_to_chroma(chunks);
+    }
+```
+
+- The `load_documents` method loads the documents from the specified directory.
+- The `split_documents` method splits the documents into chunks.
+- The `add_to_chroma` method adds the chunks to the Chroma vector store.
+
+Let's define the `load_documents` and `split_documents` methods.
+
+```jac
+    can load_documents {
+        document_loader = PyPDFDirectoryLoader(self.file_path);
+        return document_loader.load();
+    }
+
+    can split_documents(documents: list[Document]) {
+        text_splitter = RecursiveCharacterTextSplitter(
+            chunk_size=800,
+            chunk_overlap=80,
+            length_function=len,
+            is_separator_regex=False);
+        return text_splitter.split_documents(documents);
+    }
+```
+
+- The `load_documents` method loads the documents from the specified directory using the `PyPDFDirectoryLoader` class.
+- The `split_documents` method splits the documents into chunks using the `RecursiveCharacterTextSplitter` class. This ensures that documents are broken down into manageable chunks for better embedding and retrieval performance.
+
+Next, let's define the `get_embedding_function` method. This ability uses the `OllamaEmbeddings` model to create embeddings for the document chunks. These embeddings are crucial for semantic search in the vector database.
+
+```jac
+    can get_embedding_function {
+        embeddings = OllamaEmbeddings(model='nomic-embed-text');
+        return embeddings;
+    }
+```
+
+Now, each chunk of text needs a unique identifier so that it can be referenced in the vector store. The `add_chunk_id` ability assigns IDs to each chunk using the format `Page Source:Page Number:Chunk Index`. For example, the second chunk on page 3 of `docs/manual.pdf` would get the ID `docs/manual.pdf:3:1`.
+
+```jac
+    can add_chunk_id(chunks: list) {
+        last_page_id = None;
+        current_chunk_index = 0;
+
+        for chunk in chunks {
+            source = chunk.metadata.get('source');
+            page = chunk.metadata.get('page');
+            current_page_id = f'{source}:{page}';
+
+            if current_page_id == last_page_id {
+                current_chunk_index += 1;
+            } else {
+                current_chunk_index = 0;
+            }
+
+            chunk_id = f'{current_page_id}:{current_chunk_index}';
+            last_page_id = current_page_id;
+
+            chunk.metadata['id'] = chunk_id;
+        }
+
+        return chunks;
+    }
+```
+
+Once the documents are split and chunk IDs are assigned, we add them to the Chroma vector database. The `add_to_chroma` ability checks for existing documents in the database and only adds new chunks to avoid duplication.
+
+```jac
+    can add_to_chroma(chunks: list[Document]) {
+        db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
+        chunks_with_ids = self.add_chunk_id(chunks);
+
+        existing_items = db.get(include=[]);
+        existing_ids = set(existing_items['ids']);
+
+        new_chunks = [];
+        for chunk in chunks_with_ids {
+            if chunk.metadata['id'] not in existing_ids {
+                new_chunks.append(chunk);
+            }
+        }
+
+        if len(new_chunks) {
+            print('adding new documents');
+            new_chunk_ids = [chunk.metadata['id'] for chunk in new_chunks];
+            db.add_documents(new_chunks, ids=new_chunk_ids);
+        } else {
+            print('no new documents to add');
+        }
+    }
+```
+
+Next, the `get_from_chroma` ability takes a query and returns the most relevant chunks based on similarity search. This is the core of retrieval-augmented generation, as the engine fetches chunks that are semantically similar to the query.
+
+```jac
+    can get_from_chroma(query: str, chunk_nos: int=5) {
+        db = Chroma(persist_directory=self.chroma_path, embedding_function=self.get_embedding_function());
+        results = db.similarity_search_with_score(query, k=chunk_nos);
+        return results;
+    }
+}
+```
+
+To summarize, we define an object called `rag_engine` with two properties: `file_path` and `chroma_path`. The `file_path` property specifies the path to the directory containing the documents we want to retrieve content from. The `chroma_path` property specifies the path to the directory where the embeddings are stored. We will use these embeddings to retrieve relevant chunks at query time.
+
+We define a few methods to load the documents, split them into chunks, and add them to the Chroma vector store, plus a method to retrieve relevant chunks for a query. Let's break down the code:
+
+- The `load_documents` method loads the documents from the specified directory using the `PyPDFDirectoryLoader` class.
+- The `split_documents` method splits the documents into chunks using the `RecursiveCharacterTextSplitter` class from the `langchain_text_splitters` module.
+- The `get_embedding_function` method initializes the Ollama embeddings model.
+- The `add_chunk_id` method generates unique IDs for the chunks based on the source and page number.
+- The `add_to_chroma` method adds the chunks to the Chroma vector store.
+- The `get_from_chroma` method retrieves relevant chunks for a query from the Chroma vector store.
+
+### Setting up Ollama Embeddings
+Before we can use the Ollama embeddings model, we need to set it up. You can download Ollama from the [Ollama website](https://ollama.com/). Once you have installed Ollama, download the embedding model by running the following command:
+
+```bash
+ollama pull nomic-embed-text
+```
+
+This will download the `nomic-embed-text` model to your local machine.
+
+Next, make this model available for inference by running the following command:
+
+```bash
+ollama serve
+```
+
+### Adding your documents
+You can add your documents to the `docs` directory. The documents should be in PDF format, and you can add as many as you want. We've included a sample document in the `docs` directory for you to test with.
+
+### Setting up your LLM
+Here we are going to use one of the magic features of jaclang called [MTLLM](https://jaseci-labs.github.io/mtllm/). MTLLM facilitates the integration of generative AI models, specifically Large Language Models (LLMs), into your programs in an ultra-seamless manner, right from the jaclang code. You should have the `mtllm` library installed in your environment; if you don't, you can install it with `pip install mtllm[openai]`.
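+
+To get a feel for what MTLLM does before wiring it into the chatbot, here is a minimal standalone sketch. This is our own toy example, not part of the project, and it assumes `OPENAI_API_KEY` is set in your environment:
+
+```jac
+import:py from mtllm.llms {OpenAI};
+
+glob llm = OpenAI(model_name='gpt-4o');
+
+# The quoted annotations are semantic hints that MTLLM turns into a prompt for us.
+can 'Translate the sentence to French'
+translate(sentence: 'sentence to translate': str) -> 'translation': str by llm();
+
+with entry {
+    print(translate("Hello, world!"));
+}
+```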
+
+Now, create a new file called `server.jac` and add the following code:
+
+```jac
+import:py from mtllm.llms {OpenAI};
+
+glob llm = OpenAI(model_name='gpt-4o');
+```
+
+Here we use the OpenAI model `gpt-4o` as our Large Language Model (LLM). To use OpenAI you will need an API key, which you can get by signing up on the OpenAI website [here](https://platform.openai.com/). Once you have your API key, set it as an environment variable:
+
+```bash
+export OPENAI_API_KEY="<YOUR_API_KEY>"
+```
+
+Using OpenAI is not required; you can replace this with any other LLM you want to use. For example, you can use any Ollama generative model as your LLM. When using Ollama, make sure you have the model downloaded and serving on your local machine by running the following command:
+
+```bash
+ollama pull llama3.1
+```
+
+This will download the `llama3.1` model to your local machine and make it available for inference when you run the `ollama serve` command. If you want to use Ollama, replace your import statement with the following:
+
+```jac
+import:py from mtllm.llms {Ollama};
+
+glob llm = Ollama(model_name='llama3.1');
+```
+
+Now that you have your LLM ready, let's create a simple walker that uses the RAG module and MTLLM to generate responses to user queries. First, let's declare the global variables for MTLLM and the RAG engine (make sure the `rag_engine` object from `rag.jac` is visible to `server.jac`):
+
+```jac
+glob llm = OpenAI(model_name='gpt-4o');
+glob RagEngine: rag_engine = rag_engine();
+```
+
+- `llm`: This is the MTLLM model instance that jaclang uses whenever we declare `by llm()` abilities. Here we are using OpenAI's `gpt-4o` model.
+- `RagEngine`: This is an instance of the `rag_engine` object for document retrieval and processing.
+
+Next, let's define a node called `session` that stores the chat history and status of the session. If you've ever worked with graphs before, you should be familiar with nodes and edges: nodes are entities that store data, while edges are connections between nodes that represent relationships. In this case, `session` is a node in our graph that stores the chat history and status of the session. The session node also has an ability called `llm_chat` that uses the MTLLM model to generate responses based on the chat history, agent role, and context.
+
+```jac
+node session {
+    has id: str;
+    has chat_history: list[dict];
+    has status: int = 1;
+
+    can 'Respond to message using chat_history as context and agent_role as the goal of the agent'
+    llm_chat(
+        message: 'current message': str,
+        chat_history: 'chat history': list[dict],
+        agent_role: 'role of the agent responding': str,
+        context: 'retrieved context from documents': list
+    ) -> 'response': str by llm();
+}
+```
+
+**Attributes:**
+
+- `id`: A unique session identifier.
+- `chat_history`: Stores the conversation history.
+- `status`: Tracks the state of the session.
+- `llm_chat` ability: Takes the current message, chat history, agent role, and retrieved document context as inputs, and uses the LLM to generate a response. All without the need for any prompt engineering!! Wow!
+
+Next we'll define the `interact` walker that initializes a session and generates responses to user queries. But first, let's briefly discuss what a walker is and does. In a nutshell, a walker is a mechanism for traversing the graph. It moves from node to node, executing abilities and interacting with the data stored in the nodes.
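+
+To make this concrete, here is a tiny standalone walker sketch, a toy example of ours that is separate from the chatbot. It spawns at the root node and visits every node connected to it:
+
+```jac
+node place {
+    has name: str;
+}
+
+walker visitor {
+    can start with `root entry {
+        print("starting at root");
+        visit [-->];  # queue up every node connected from here
+    }
+
+    can greet with place entry {
+        print("visiting " + here.name);
+    }
+}
+
+with entry {
+    root ++> place(name="home");
+    root ++> place(name="work");
+    visitor() spawn root;
+}
+```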
+Back to our chatbot: the `interact` walker is responsible for handling user interactions and generating responses. This is one of the key components of jaclang that makes it so powerful! Super cool, right? 🤯
+
+```jac
+walker interact {
+    has message: str;
+    has session_id: str;
+
+    can init_session with `root entry {
+        visit [-->](`?session)(?id == self.session_id) else {
+            session_node = here ++> session(id=self.session_id, chat_history=[], status=1);
+            print("Session Node Created");
+
+            visit session_node;
+        }
+    }
+```
+
+**Attributes:**
+
+- `message`: The user's message.
+- `session_id`: The unique session identifier.
+- `init_session` ability: Looks up the session matching the session ID, creating a new session node if none exists. Note that this ability is triggered on `root entry`. In every graph, there is a special node called `root` that serves as the starting point for the graph. A walker can be spawned on, and traverse to, any node in the graph; it does **NOT** have to start at the root node, but it can be spawned there to start the traversal.
+
+Now, let's define the `chat` ability, which, once the session is initialized, handles the interaction between the user and the document retrieval system.
+
+```jac
+    can chat with session entry {
+
+        here.chat_history.append({"role": "user", "content": self.message});
+
+        data = RagEngine.get_from_chroma(query=self.message);
+        response = here.llm_chat(
+            message=self.message,
+            chat_history=here.chat_history,
+            agent_role="You are a conversation agent designed to help users with their queries based on the documents provided",
+            context=data
+        );
+
+        here.chat_history.append({"role": "assistant", "content": response});
+
+        report {
+            "response": response
+        };
+    }
+}
+```
+
+**Logic flow:**
+
+- The user's message is added to the chat history.
+- The RAG engine retrieves relevant chunks based on the user's message.
+- The MTLLM model generates a response based on the user's message, chat history, agent role, and retrieved context.
+- The assistant's response is added to the chat history.
+- The response is reported back to the frontend. Here we use the special `report` keyword. This is one of the key features of jac-cloud; it operates a bit like a return statement, but it does not stop the execution of the walker. It simply adds whatever is reported to the response object that is sent back to the frontend. Isn't that cool? 🤩
+
+To summarize:
+
+- We define a `session` node that stores the chat history and status of the session. The session node also has an ability called `llm_chat` that uses the MTLLM model to generate responses based on the chat history, agent role, and context.
+- We define an `interact` walker that initializes a session and generates responses to user queries. The walker uses the `RagEngine` object to retrieve relevant chunks and the `llm_chat` ability to generate the final response.
+
+You can now serve this code using Jac Cloud by running the following command:
+
+```bash
+DATABASE_HOST=mongodb://localhost:27017/?replicaSet=my-rs jac serve server.jac
+```
+
+Now you can test out your chatbot using the Streamlit interface. The chatbot will retrieve relevant content from the documents and generate the final response using the MTLLM model. Ask any question related to the documents you added to the `docs` directory and see how the chatbot responds.
+
+You can also try testing the updated endpoint using the Swagger UI at `http://localhost:8000/docs` or using the following curl command:
+
+```bash
+curl -X POST http://localhost:8000/walker/interact \
+-H "Authorization: Bearer <ACCESS_TOKEN>" \
+-H "Content-Type: application/json" \
+-d '{"message": "I am having major back pain, what can I do?", "session_id": "123"}'
+```
+
+Remember to replace `<ACCESS_TOKEN>` with the access token of your user.
+
+In the next part of the tutorial, we will enhance the chatbot by adding dialogue routing capabilities. We will direct the conversation to the appropriate dialogue model based on the user's input.
\ No newline at end of file
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/3_rag-dialogue-routing-chatbot.md b/jac/support/jac-lang.org/docs/learn/tutorial/3_rag-dialogue-routing-chatbot.md
new file mode 100644
index 0000000000..265680e38b
--- /dev/null
+++ b/jac/support/jac-lang.org/docs/learn/tutorial/3_rag-dialogue-routing-chatbot.md
@@ -0,0 +1,144 @@
+# RAG + Dialogue Routing Chatbot
+Now that you have built a RAG chatbot, you can enhance it by adding dialogue routing capabilities. Dialogue routing is the process of directing the conversation to the appropriate dialogue model based on the user's input.
+
+In some cases, an incoming user query may not be suitable for the RAG model, and it may be better to route the conversation to a different model or a different dialogue system altogether. In this part of the tutorial, you will learn how to build a RAG chatbot with dialogue routing capabilities using Jac Cloud and Streamlit.
+
+Let's modify the RAG chatbot we built in the previous part to include dialogue routing. We will add a simple dialogue router that routes the conversation to the appropriate dialogue model based on the user's input.
+
+First, let's add a new enum `ChatType` to define the different dialogue states we will use:
+
+```jac
+enum ChatType {
+    RAG : 'Need to use Retrievable information in specific documents to respond' = "RAG",
+    QA : 'Given context is enough for an answer' = "user_qa"
+}
+```
+
+- The `RAG` state is used when we need retrievable information from specific documents to respond to the user query.
+- The `QA` state is used when the given context is enough for an answer.
+
+Next, we'll add a new node to our graph called `router`. The `router` node is responsible for determining which type of model (RAG or QA) should be used to handle the query.
+
+```jac
+node router {
+    can 'route the query to the appropriate task type'
+    classify(message: 'query from the user to be routed.': str) -> ChatType by llm(method="Reason", temperature=0.0);
+}
+```
+
+The `router` node has an ability called `classify` that takes the user query as input and classifies it into one of the `ChatType` states using the `by llm` feature from MTLLM. So cool! 😎
+
+Next, we'll add a new walker called `infer`. The `infer` walker contains the logic for routing the user query to the appropriate dialogue model based on the classification from the `router` node.
+
+```jac
+walker infer {
+    has message: str;
+    has chat_history: list[dict];
+    has response: str = "";  # filled in by the chat nodes below
+
+    can init_router with `root entry {
+        visit [-->](`?router) else {
+            router_node = here ++> router();
+            print("Router Node Created");
+            router_node ++> rag_chat();
+            router_node ++> qa_chat();
+            visit router_node;
+        }
+    }
+```
+
+Here we have a new ability `init_router` that initializes the `router` node and creates an edge to each of two new nodes, `rag_chat` and `qa_chat`. These nodes will handle the RAG and QA models respectively.
+We'll define these nodes later; let's finish the `infer` walker first.
+
+```jac
+    can route with router entry {
+        classification = here.classify(message=self.message);
+
+        print("Classification:", classification);
+
+        visit [-->](`?chat)(?chat_type == classification);
+    }
+}
+```
+
+Here we add an ability `route` that classifies the user query using the `router` node and routes it to the matching `chat` node. Let's define the `chat` node next.
+
+```jac
+node chat {
+    has chat_type: ChatType;
+}
+
+node rag_chat :chat: {
+    has chat_type: ChatType = ChatType.RAG;
+
+    can respond with infer entry {
+        print("Responding to the message");
+        can 'Respond to message using chat_history as context and agent_role as the goal of the agent'
+        respond_with_llm(
+            message: 'current message': str,
+            chat_history: 'chat history': list[dict],
+            agent_role: 'role of the agent responding': str,
+            context: 'retrieved context from documents': list
+        ) -> 'response': str by llm();
+        data = RagEngine.get_from_chroma(query=here.message);
+        print("Data:", data);
+        here.response = respond_with_llm(here.message, here.chat_history, "You are a conversation agent designed to help users with their queries based on the documents provided", data);
+        print("Here:", here);
+    }
+}
+
+node qa_chat :chat: {
+    has chat_type: ChatType = ChatType.QA;
+
+    can respond with infer entry {
+        print("Responding to the message");
+        can 'Respond to message using chat_history as context and agent_role as the goal of the agent'
+        respond_with_llm(
+            message: 'current message': str,
+            chat_history: 'chat history': list[dict],
+            agent_role: 'role of the agent responding': str
+        ) -> 'response': str by llm();
+        here.response = respond_with_llm(here.message, here.chat_history, agent_role="You are a conversation agent designed to help users with their queries");
+        print("Here:", here);
+    }
+}
+```
+
+We define two new nodes, `rag_chat` and `qa_chat`, that extend the `chat` node. The `rag_chat` node is used for the RAG model, and the `qa_chat` node is used for a simple question-answering model. Both nodes have a `respond` ability that answers the user query using the respective model.
+
+In the `rag_chat` node, the nested `respond_with_llm` ability answers the user query using the RAG model: it retrieves the relevant information from the documents and responds based on it. In the `qa_chat` node, the `respond_with_llm` ability answers the user query using a simple question-answering model with no retrieved context.
+
+Lastly, we'll update our `session` node. Instead of directly responding to the user query like before, the `session` node will do something super cool! 🤯
+
+```jac
+node session {
+    can chat with interact entry {
+        self.chat_history.append({"role": "user", "content": here.message});
+        response = infer(message=here.message, chat_history=self.chat_history) spawn root;
+        self.chat_history.append({"role": "assistant", "content": response.response});
+
+        report {
+            "response": response.response
+        };
+    }
+}
+```
+
+In our updated `session` node, we have a new ability `chat` that is triggered by the `interact` walker: when the `interact` walker traverses to the `session` node, it triggers the `chat` ability. The `chat` ability then spawns the `infer` walker on `root`, and the `infer` walker executes its logic to route the user query to the appropriate dialogue model based on the classification.
+The response from the dialogue model is stored on the `infer` walker's object and reported back to the frontend. This is the magic of Data Spatial programming! 🪄 Super dope, right? 😎
+
+To summarize, here are the changes we made to our RAG chatbot to add dialogue routing capabilities:
+
+- Our first addition is the enum `ChatType`, which defines the different types of chat models we will use. We have two types: `RAG` and `QA`. `RAG` is for the RAG model, and `QA` is for a simple question-answering model. This is used to classify the user query so it can be routed to the appropriate model.
+- Next we have a new node `router`, with the ability `classify`. This ability classifies the user query for routing.
+- We have a new walker `infer`, which has the abilities `init_router` and `route`. The `init_router` ability initializes the router node; the `route` ability sends the user query to the appropriate model based on the classification.
+- We have two new nodes, `rag_chat` and `qa_chat`, which are the chat models for the RAG and QA models respectively. These nodes extend the `chat` node and have the ability `respond`, which answers the user query using the respective model.
+- In the `rag_chat` node, the nested `respond_with_llm` ability answers the user query using the RAG model, retrieving the relevant information from the documents first.
+- In the `qa_chat` node, the nested `respond_with_llm` ability answers the user query using a simple question-answering model.
+- Lastly, we update the `session` node with the ability `chat`, which is triggered when the `interact` walker enters the node. This ability spawns the `infer` walker and reports the response back to the frontend.
+
+Now that we have added dialogue routing capabilities to our RAG chatbot, we can test it by running the following command:
+
+```bash
+DATABASE_HOST=mongodb://localhost:27017/?replicaSet=my-rs jac serve server.jac
+```
+
+Voila! You have successfully built a RAG chatbot with dialogue routing capabilities using Jac Cloud and Streamlit. You can now test your chatbot by interacting with it in the Streamlit frontend. Have fun chatting!
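+
+If you prefer the command line, you can exercise the router through the same `interact` endpoint as before. A quick sketch (replace `<ACCESS_TOKEN>` with your token; the router's classification is printed in the server logs):
+
+```bash
+curl -X POST http://localhost:8000/walker/interact \
+-H "Authorization: Bearer <ACCESS_TOKEN>" \
+-H "Content-Type: application/json" \
+-d '{"message": "What topics do the documents cover?", "session_id": "123"}'
+```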
\ No newline at end of file
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/1_swagger.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/1_swagger.png
new file mode 100644
index 0000000000..5db51f8a86
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/1_swagger.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/1_vscode.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/1_vscode.png
new file mode 100644
index 0000000000..d6a98a478d
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/1_vscode.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/2_graph.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_graph.png
new file mode 100644
index 0000000000..d2efd27229
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_graph.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag.png
new file mode 100644
index 0000000000..104ede10d3
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag_search.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag_search.png
new file mode 100644
index 0000000000..5688d32e2e
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/2_rag_search.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/images/3_chat_interface.png b/jac/support/jac-lang.org/docs/learn/tutorial/images/3_chat_interface.png
new file mode 100644
index 0000000000..058cf8900c
Binary files /dev/null and b/jac/support/jac-lang.org/docs/learn/tutorial/images/3_chat_interface.png differ
diff --git a/jac/support/jac-lang.org/docs/learn/tutorial/readme.md b/jac/support/jac-lang.org/docs/learn/tutorial/readme.md
new file mode 100644
index 0000000000..244bf33ce7
--- /dev/null
+++ b/jac/support/jac-lang.org/docs/learn/tutorial/readme.md
@@ -0,0 +1,15 @@
+# Introduction
+In this tutorial, you are going to learn how to build a state-of-the-art conversational AI system with Jac Cloud and the Jac programming language. You will learn the basics of jaclang, how to use large language models, and everything in between, in order to create an end-to-end, fully functional conversational AI system.
+
+### Preparation / Installation
+- Install dependencies:
+```bash
+pip install jaclang mtllm[openai] jaclang_streamlit jac-cloud requests langchain_community chromadb langchain pypdf
+```
+
+### Project Steps
+Once you have installed the dependencies, you can start building your conversational AI system by following these steps:
+
+1. [Setting up Jac Cloud](1_setting-up-jac-cloud.md)
+2. [Building a RAG Chatbot with Jac Cloud and Streamlit](2_building-a-rag-chatbot.md)
+3. [RAG + Dialogue Routing Chatbot](3_rag-dialogue-routing-chatbot.md)
diff --git a/jac/support/jac-lang.org/mkdocs.yml b/jac/support/jac-lang.org/mkdocs.yml
index 4ee1c44c27..136e872194 100644
--- a/jac/support/jac-lang.org/mkdocs.yml
+++ b/jac/support/jac-lang.org/mkdocs.yml
@@ -8,6 +8,11 @@ nav:
     - "start/installation.md"
     - "start/jac_in_a_flash.md"
   - Learn:
+      - Tutorial:
+          - "learn/tutorial/readme.md"
+          - "learn/tutorial/1_setting-up-jac-cloud.md"
+          - "learn/tutorial/2_building-a-rag-chatbot.md"
+          - "learn/tutorial/3_rag-dialogue-routing-chatbot.md"
     - For coders:
       - "learn/guide.md"
       - "learn/jac_ref.md"