A chatbot that plays Call of Cthulhu (CoC) with you, powered by AI.
Check out this transcript:
In the first message, I asked Cocai to generate a character for me:
> Can you generate a character for me? Let's call him "Don Joe". Describe what kind of guy he is, then tabulate his stats.
Under the hood, Cocai used Cochar, a character-generator service. In its first couple of attempts, Cocai forgot to provide some required parameters, but it fixed that problem and successfully generated a character profile from Cochar.
Then I asked Cocai what I, playing the character of Mr. Don Joe, could do while stuck in a dark cave. It suggested a couple of options and described the potential outcomes associated with each choice.
I then asked Cocai to roll a skill check for me, Spot Hidden. Based on the chat history, Cocai was able to recall that Mr. Joe had a Spot Hidden skill of 85%. It rolled the dice, got a successful outcome, and took some inspiration from its 2nd response to progress the story.
Thanks to the chain-of-thought (CoT) visualization feature, you can unfold the tool-using steps and verify for yourself that Cocai was indeed able to recall the precise value of Joe's Spot Hidden skill:
Prominent dependencies of Cocai include:
```mermaid
flowchart TD
    subgraph Standalone Programs
        o[Ollama]
        s[Stable Diffusion Web UI]
        subgraph managed by Docker Compose
            q[(Qdrant)]
            m[(minIO)]
            a[Arize Phoenix]
        end
    end
    subgraph Python packages
        mem0[mem0]
        l[Chainlit]
        c[LlamaIndex]
    end
    s -. "provides drawing capability to" .-> c
    o -. "provides LLM & Embedding Model to" .-> c
    q -- "provides Vector DB to" --> c
    q -- "provides Vector DB to" --> mem0
    mem0 -- "provides short-term memory to" --> c
    o -- "provides LLM & Embedding Model to" --> mem0
    m -- "provides Object DB to" --> l
    l -- "provides Web UI to" --> c
    a -- "provides observability to" --> c
```
Zooming in on the programs managed by Docker Compose, here are the ports and local folders (git-ignored) that each container will expose and use:
(Generated via `docker run --rm -it --name dcv -v $(pwd):/input pmsipilot/docker-compose-viz render -m image docker-compose.yaml`.)
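If you'd rather not regenerate that image, you can read the same information off a running stack:

```sh
# Lists each container defined in docker-compose.yaml along with the host ports it publishes.
docker-compose ps
```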
There are a couple of things you have to do manually before you can start using the chatbot.
- Clone the repository (how).
- Install the required binary, standalone programs. These are not Python packages, so they aren't managed by `pyproject.toml`.
- Self-serve a text embedding model. This model "translates" your text into numbers, so that the computer can understand you.
- Choose a way to serve a large language model (LLM). You can either use OpenAI's API or self-host a local LLM with Ollama.
- Initialize secrets.
No need to explicitly install Python packages. `uv`, the package manager of our choice, will implicitly install the required packages when you boot up the chatbot for the first time.
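For illustration, the first `uv run ...` invocation (presumably what `just serve` wraps under the hood) resolves `pyproject.toml`, creates `.venv`, and installs everything automatically. If you prefer to pre-install the dependencies anyway, this optional command does the same thing up front:

```sh
# Optional: resolve and install all project dependencies ahead of time.
# Otherwise, the first `uv run` invocation does this implicitly.
uv sync
```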
These are the binary programs that you need to have ready before running Cocai:
- `just`, a command runner. I use this because I always tend to forget the exact command to run.
- `uv`, the Python package manager that Cocai uses. It does not require you to explicitly create a virtual environment.
- Docker. Cocai requires many types of databases, e.g. object storage and vector storage, along with some containerized applications. We need the `docker-compose` command to orchestrate these containers.
- Ollama. Doc ingestion and memories rely on a local embedding model.
- (Optional) Tmuxinator and `tmux`, if you ever want to run the chatbot the easy way (discussed later).
If you are on macOS, you can install these programs using Homebrew:
```sh
brew install just uv ollama tmuxinator
brew install --cask docker
```
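A quick sanity check that everything you just installed is actually on your `PATH` (optional; the list mirrors the requirements above):

```sh
# Report any required binary that is still missing.
for cmd in just uv ollama docker docker-compose tmux tmuxinator; do
  command -v "$cmd" >/dev/null || echo "missing: $cmd"
done
```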
Optionally, also install Stable Diffusion Web UI. This allows the chatbot to generate illustrations.
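If you go that route, one reasonable setup (an assumption, not a hard requirement) is to clone AUTOMATIC1111's implementation as a sibling directory of Cocai, which is where the `webui.sh` command shown later expects it to live:

```sh
# Clone the Stable Diffusion web UI next to the Cocai repository.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui ../stable-diffusion-webui
```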
Ensure that you have a local Ollama server running (if not, start one with `ollama serve`). Then, download the `nomic-embed-text` model by running:

```sh
ollama pull nomic-embed-text
```
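To confirm the embedding model is in place and the server is answering, you can list the local models and request a test embedding (optional):

```sh
# Show locally available models, then ask Ollama's HTTP API for one embedding.
ollama list
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello"}'
```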
The easiest (and perhaps highest-quality) way would be to provide an API key to OpenAI. Simply add `OPENAI_API_KEY=sk-...` to a `.env` file in the project root.
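For example (the key shown is a placeholder):

```sh
# Append the placeholder key to the project-root .env file; replace it with your real key.
echo 'OPENAI_API_KEY=sk-...' >> .env
```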
In the absence of an OpenAI API key, the chatbot will default to using Ollama, a program that serves LLMs locally.
- Ensure that your local Ollama server has already downloaded the `llama3.1` model. If you haven't (or aren't sure), run `ollama pull llama3.1`. (A quick responsiveness check follows this list.)
- If you want to use a different model that does not support function-calling, that's also possible. Revert this commit, so that you can use the ReAct paradigm to simulate function-calling capabilities with a purely semantic approach.
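To quickly confirm the local model is downloaded and responding (optional):

```sh
# Sends a one-off prompt to the local model and prints its reply.
ollama run llama3.1 "Briefly introduce yourself."
```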
Run `chainlit create-secret` to generate a JWT token. Follow the instructions to add the secret to `.env`.
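Since Python packages are managed implicitly by `uv`, you can invoke the Chainlit CLI through it:

```sh
# Prints a freshly generated secret; copy it into .env as the command's output instructs.
uv run chainlit create-secret
```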
Start serving minIO for the first time (by running `minio server .minio/` if you have a local binary installed, or by using the Docker Compose command discussed below). Then navigate to `http://127.0.0.1:57393/access-keys` and create a new access key. (You may need to log in first. The default credentials can be found in their official documentation.) Add the access key and secret key to `.env`:
```sh
MINIO_ACCESS_KEY="foo"
MINIO_SECRET_KEY="bar"
```
Optionally, if you want to enable the chatbot to search the internet, you can provide a Tavily API key. Add `TAVILY_API_KEY=...` to `.env`.
Optionally, if you prefer to use OpenAI ("GPT") as your LLM, add `OPENAI_API_KEY=...` to `.env`.
Optionally, if you prefer to use a hosted open LLM, you can try Together.ai. Add `TOGETHER_AI_API_KEY=...` to `.env`.
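For reference, a fully populated `.env` might look like the sketch below; every value is a placeholder, and you only need the lines that apply to your setup.

```sh
# .env (project root). Keep this file out of version control.
MINIO_ACCESS_KEY="foo"
MINIO_SECRET_KEY="bar"
OPENAI_API_KEY=sk-...          # only if you use OpenAI
TAVILY_API_KEY=...             # only if you want internet search
TOGETHER_AI_API_KEY=...        # only if you use Together.ai
# ...plus the secret printed by `chainlit create-secret`.
```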
There are two ways to start the chatbot: the easy way and the hard way.
In the easy way, simply run `just serve-all`. This will start all the required standalone programs and the chatbot in one go. Notes:
- Use of multiplexer. To avoid cluttering up your screen, we use a terminal multiplexer (`tmux`), which essentially divides your terminal window into panes, each running a separate program. The panes are defined in the file `tmuxinator.yaml`. Tmuxinator is a separate program that manages `tmux` sessions declaratively. (Basic `tmux` commands you may need are listed after these notes.)
- Don't use the Dockerfile. For a tech demo, I hacked up a `Dockerfile`, which uses this `just serve-all` command. But the `tmuxinator.yaml` file has been updated since, and I'm pretty sure the Dockerfile is broken now.
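If you're new to tmux, these are the commands you'll most likely need once `just serve-all` has the session running (standard tmux usage; the session layout comes from `tmuxinator.yaml`):

```sh
# Detach from the session without stopping anything: press Ctrl-b, then d.
# Re-attach to it later:
tmux attach
# Tear everything down (kills all panes that tmuxinator started):
tmux kill-server
```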
In the hard way, you create a separate terminal for each command (a condensed, copy-pasteable summary follows this list):
- Start serving Ollama by running `ollama serve`. It should be listening at `http://localhost:11434/v1`. Details:
  - This is for locally inferencing embedding & language models.
  - I did not containerize this because Docker doesn't support GPUs on Apple Silicon (as of Feb 2024), which is what I'm using.
- Start Docker containers by running `docker-compose up`. This includes:
  - minIO object database (for persisting data for our web frontend, including user credentials and chat history, though not thought chains)
  - Arize Phoenix platform (for debugging thought chains)
  - Qdrant vector database (for the chatbot's short-term memory, implemented via `mem0`)
- Optionally, start serving a "Stable Diffusion web UI" server with API support turned on by running `cd ../stable-diffusion-webui; ./webui.sh --api --nowebui --port 7860`.
  - This enables your AI Keeper to draw illustrations.
  - If Stable Diffusion is not running, the AI Keeper will still be able to generate text-based responses; it just won't be able to draw illustrations.
- Finally, start serving the chatbot by running `just serve`.
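Here is that condensed summary; run each command in its own terminal (same commands as above, and the Stable Diffusion step remains optional):

```sh
# Terminal 1: local LLM & embedding server
ollama serve

# Terminal 2: containers (minIO, Arize Phoenix, Qdrant)
docker-compose up

# Terminal 3 (optional): Stable Diffusion web UI with API support enabled
cd ../stable-diffusion-webui && ./webui.sh --api --nowebui --port 7860

# Terminal 4: the chatbot itself
just serve
```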
Either way, Cocai should be ready at `http://localhost:8000/chat/`. Log in with the dummy credentials `admin` and `admin`.
If you see:
File ".../llvmlite-0.43.0.tar.gz/ffi/build.py", line 142, in main_posix
raise RuntimeError(msg) from None
RuntimeError: Could not find a `llvm-config` binary. There are a number of reasons this could occur, please see: https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html#using-pip for help.
error: command '.../bin/python' failed with exit code 1
Then run:

```sh
brew install llvm
```
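Homebrew installs LLVM keg-only, so `llvm-config` may still not be on your `PATH` afterwards. If the error persists, pointing the build at Homebrew's copy usually helps (a workaround based on llvmlite's documented `LLVM_CONFIG` override, not part of the original setup):

```sh
# Tell llvmlite's build where Homebrew put llvm-config, then retry the failing command.
export LLVM_CONFIG="$(brew --prefix llvm)/bin/llvm-config"
```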
If your `uv run phoenix serve` command fails with:
```
Traceback (most recent call last):
  File "Cocai/.venv/bin/phoenix", line 5, in <module>
    from phoenix.server.main import main
  File "Cocai/.venv/lib/python3.11/site-packages/phoenix/__init__.py", line 12, in <module>
    from .session.session import (
  File ".venv/lib/python3.11/site-packages/phoenix/session/session.py", line 41, in <module>
    from phoenix.core.model_schema_adapter import create_model_from_inferences
  File ".venv/lib/python3.11/site-packages/phoenix/core/model_schema_adapter.py", line 11, in <module>
    from phoenix.core.model_schema import Embedding, Model, RetrievalEmbedding, Schema
  File ".venv/lib/python3.11/site-packages/phoenix/core/model_schema.py", line 554, in <module>
    class ModelData(ObjectProxy, ABC): # type: ignore
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases
```
then you can work around the problem for now by serving Arize Phoenix from a Docker container:
```sh
docker run -p 6006:6006 -p 4317:4317 -i -t arizephoenix/phoenix:latest
```
🧑‍💻 The software itself is licensed under AGPL-3.0.
📒 The default CoC module, “Clean Up, Aisle Four!” is written by Dr. Michael C. LaBossiere. All rights reserved to the original author. Adopted here with permission.
(A "CoC module" is also known as a CoC scenario, campaign, or adventure. It comes in the form of a booklet. Some CoC modules come with their own rulebooks. Since this project is just between the user and the chatbot, let's choose a single-player module.)
🎨 Logo is an AI-generated artwork by @Norod78, originally published on Civitai. Adopted here with permission.