This application implements a chatbot with a Next.js frontend and a FastAPI backend, using LangChain to retrieve and process content from websites.
- Web Interaction via LangChain: Uses LangChain to fetch and extract information from websites.
- Versatile Language Model Integration: Offers compatibility with various models including GPT-4. Users can easily switch between models to suit their needs.
- User-Friendly Next.js Frontend: The interface is intuitive and accessible for users of all technical backgrounds.
The application integrates the Python/FastAPI server into the Next.js app under the `/api/` route. This is achieved through `next.config.js` rewrites, which direct any `/api/:path*` request to the FastAPI server located in the `/api` folder. Locally, FastAPI runs on `127.0.0.1:8000`; in production, it runs as serverless functions on Vercel. A rewrite configuration along these lines is sketched below.
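As a sketch, such a rewrite rule might look like the following (the exact destination mapping is an assumption based on the description above; check this repo's `next.config.js` for the authoritative version):

```js
// next.config.js — sketch of the rewrite rule described above.
// In development, /api/* requests are proxied to the local FastAPI
// server; in production, Vercel serves the FastAPI app directly
// as serverless functions.
module.exports = {
  async rewrites() {
    return [
      {
        source: "/api/:path*",
        destination:
          process.env.NODE_ENV === "development"
            ? "http://127.0.0.1:8000/api/:path*"
            : "/api/",
      },
    ];
  },
};
```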
- Install dependencies:
  ```bash
  npm install
  ```
- Create a `.env` file with your OpenAI API key:
  ```bash
  OPENAI_API_KEY=[your-openai-api-key]
  ```
- Start the development server:
  ```bash
  npm run dev
  ```
- Access the application at http://localhost:3000. The FastAPI server runs on http://127.0.0.1:8000.
For backend-only testing:

```bash
conda create --name nextjs-fastapi-your-chat python=3.10
conda activate nextjs-fastapi-your-chat
pip install -r requirements.txt
uvicorn api.index:app --reload
```
Options for preserving chat history include:
- Global Variable: Simple but not ideal for scalability or consistency (a minimal sketch follows this list).
- In-Memory Database/Cache: Scalable solutions like Redis for storing chat history.
- Database Storage: Robust and persistent method, suitable for production environments.
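For illustration, a minimal sketch of the global-variable approach (all names here are hypothetical, not functions from this repo). It works for local experiments, but the history is lost on restart and is not shared across serverless instances, which is why the other two options exist:

```python
# Minimal sketch: chat history in a module-level dict keyed by
# session id. State lives only in this process.
from collections import defaultdict

chat_histories: dict[str, list[dict]] = defaultdict(list)

def append_message(session_id: str, role: str, content: str) -> None:
    # Record one message ("user" or "assistant") for a session.
    chat_histories[session_id].append({"role": role, "content": content})

def get_history(session_id: str) -> list[dict]:
    # Return the full message list for a session (empty if new).
    return chat_histories[session_id]
```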
RAG (Retrieval-Augmented Generation) enhances a language model with context retrieved from a custom knowledge base. The process involves fetching HTML documents, splitting them into chunks, and vectorizing the chunks with an embedding model such as OpenAI's. The vectorized chunks form a vector store that supports semantic search over user queries. The most relevant chunks are then passed to the language model as context, grounding its response to the user's question.
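A sketch of the indexing step using LangChain is shown below. The function name and chunking parameters are illustrative, and import paths vary across LangChain versions; this follows the `langchain-community` layout:

```python
# Sketch of the RAG indexing pipeline: fetch -> split -> embed -> store.
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

def build_vectorstore(url: str) -> Chroma:
    # Fetch the HTML document and extract its text.
    documents = WebBaseLoader(url).load()
    # Split the text into overlapping chunks sized for embedding.
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100
    ).split_documents(documents)
    # Embed the chunks and store them in a vector store so they
    # can later be retrieved by semantic similarity.
    return Chroma.from_documents(chunks, OpenAIEmbeddings())
```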
The `get_vectorstore_from_url` function extracts and processes text from a given URL, while `get_context_retriever_chain` builds a chain that retrieves context relevant to the entire conversation history. This pipeline keeps responses grounded in both the page content and the ongoing conversation.
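A sketch of what such a chain might look like, assuming `get_context_retriever_chain` follows the standard LangChain history-aware retriever pattern (the prompt wording and function body here are assumptions, not this repo's exact code):

```python
# Sketch of a history-aware retriever chain.
from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

def context_retriever_chain(vector_store):
    llm = ChatOpenAI()
    retriever = vector_store.as_retriever()
    # The LLM first rewrites the latest user message into a standalone
    # search query, using the full chat history for context; the query
    # is then run against the vector store.
    prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
        ("user", "Given the conversation above, generate a search query "
                 "to look up information relevant to the discussion."),
    ])
    return create_history_aware_retriever(llm, retriever, prompt)
```

Invoked with `{"chat_history": [...], "input": "..."}`, the chain returns the document chunks most relevant to the conversation, which are then passed to the language model as context.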