This is Tau!
Tau is inspired by Pi.AI, and if you haven't tried Pi yet, I strongly encourage you to.
Like Pi, Tau keeps one continual conversation, unlike chat-based bots that feature many separate conversations and threads.
This is by design - Tau has a single conversation, like speaking to a human.
This is also reflected in how Tau is consulted on decisions made during development: the order of features, the voice type, etc.
Tau is a personal fun project.
I open-sourced it for anyone to experiment with (fork) or just follow. (A star is appreciated!)
If you fork, delete the history and facts to reset Tau's knowledge and embark on the journey anew!
- System prompt: Speech actions define the conversation structure.
- Conversation loop: A continuous conversation with ongoing context (see the sketches after this list).
- Immediate memory: Reduce the context by summarizing it to key points, and inject that memory into the system prompt.
- Long-term memory: Save the running memory to a vector database.
- Speech: Voice-based conversation with hearing and speaking (Whisper and OpenAI TTS).
- Vision infra: Set up Hailo-8L as an internal vision web service.
- Set up the Hailo-8L on the Raspberry Pi and validate that the examples work.
- Look into best practices and options for integrating Hailo into the application.
- Set up hailo-as-a-service as a git submodule.
- Vision: Add Hailo-8L support for at least one model.
- Vision: Add Hailo-8L support for a family of models (object detection, face recognition, pose detection).
- Vision: Scene detection.
- Vision: Text extraction.
- Long-term fetching: Pull from long-term memory into the context.
- Entity-based memory: Add GraphRAG-based memory.
- Advanced voice: Move to ElevenLabs advanced voices.
- Introspection: Add an introspection agent for active and background thinking and processing.
- Growth: Add nightly fine-tuning, and move to a smaller model.
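
To make the conversation loop and the immediate memory concrete, here is a minimal sketch (not Tau's actual code). It assumes the OpenAI chat completions API; the model names, summarization threshold, and prompt wording are illustrative assumptions.

```python
# Minimal sketch: a single continual conversation whose older turns are
# periodically summarized to key points ("immediate memory") and injected
# into the system prompt. Models and the threshold are assumptions.
from openai import OpenAI

client = OpenAI()
memory = ""    # running key-point summary
history = []   # recent turns kept verbatim

def summarize(turns):
    # Compress older turns into key points to keep the context small.
    joined = "\n".join(f"{m['role']}: {m['content']}" for m in turns)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed summarizer model
        messages=[{"role": "user",
                   "content": "Summarize this conversation to key points:\n" + joined}],
    )
    return resp.choices[0].message.content

while True:
    user_text = input("you> ")  # in Tau this arrives from the microphone listener
    history.append({"role": "user", "content": user_text})
    system = "You are Tau, a single continual conversation.\nMemory:\n" + memory
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed conversation model
        messages=[{"role": "system", "content": system}] + history,
    )
    reply = resp.choices[0].message.content
    print("tau>", reply)
    history.append({"role": "assistant", "content": reply})
    if len(history) > 20:                 # assumed threshold
        memory = summarize(history[:-4])  # fold older turns into memory
        history = history[-4:]            # keep only the latest turns verbatim
```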
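
The speech side can be sketched the same way, hearing with Whisper and speaking with OpenAI TTS; the file names and voice below are assumptions:

```python
# Minimal sketch of hearing (Whisper transcription) and speaking (OpenAI TTS).
from openai import OpenAI

client = OpenAI()

# Hearing: transcribe a recorded microphone clip (the path is an assumption).
with open("mic_capture.wav", "rb") as f:
    heard = client.audio.transcriptions.create(model="whisper-1", file=f).text

# Speaking: synthesize a reply and write the audio to disk for playback.
audio = client.audio.speech.create(model="tts-1", voice="alloy", input="Hi, I'm Tau!")
with open("reply.mp3", "wb") as f:
    f.write(audio.content)
```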
Tau should be able to run on any Linux machine with internet access, but it has only been tested on a Raspberry Pi 5 (8 GB) running the official 64-bit OS.
All needed keys are in .env_sample.
Copy it to .env and add your keys.
Currently, the main key is OpenAI (chat, speech, Whisper), while VoyageAI and Pinecone are used for the vector DB.
I plan on moving back to Anthropic (Claude 3.5 Sonnet only).
Groq was used for a fast action-understanding use case.
- Clone this repository to your Raspberry Pi:
git clone https://github.com/your-username/your-repo.git
- Copy .env_sample to .env and add all keys:
- ANTHROPIC_API_KEY: Used for Claude-based text completion and vision. Currently unused.
- OPENAI_API_KEY: Used for speech, Whisper, vision, and text.
- GROQ_API_KEY: Used for super-quick action understanding; may be replaced with embeddings.
- VOYAGE_API_KEY: VoyageAI is recommended by Anthropic. They offered the best embeddings as of when I selected them, and they offer a great option for innovators.
- PINECONE_API_KEY: Pinecone API key. Serverless is a great option.
- PINECONE_DIMENSION: Dimension of the embeddings generated by Voyage; used when setting up Pinecone (see the sketch below).
- PINECONE_INDEX_NAME: Name of the Pinecone index used for memory.
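
To show how the Voyage and Pinecone settings fit together, and how saving to and fetching from long-term memory works, here is a minimal sketch. The embedding model, cloud/region, and record IDs are assumptions; PINECONE_DIMENSION must match the embedding size of the Voyage model you pick.

```python
# Minimal sketch: embed a memory summary with VoyageAI, store it in a
# serverless Pinecone index, and fetch the closest memories back.
import os

import voyageai
from pinecone import Pinecone, ServerlessSpec

vo = voyageai.Client()  # reads VOYAGE_API_KEY from the environment
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

name = os.environ["PINECONE_INDEX_NAME"]
dim = int(os.environ["PINECONE_DIMENSION"])  # e.g. 1024 for voyage-2
if name not in pc.list_indexes().names():
    pc.create_index(name=name, dimension=dim, metric="cosine",
                    spec=ServerlessSpec(cloud="aws", region="us-east-1"))  # assumed region
index = pc.Index(name)

# Save a memory: embed the summary and upsert it with its text as metadata.
summary = "The user prefers short answers and is building a robot."
vec = vo.embed([summary], model="voyage-2", input_type="document").embeddings[0]
index.upsert(vectors=[{"id": "mem-1", "values": vec, "metadata": {"text": summary}}])

# Long-term fetching: embed a query and pull the closest memories into context.
qvec = vo.embed(["What does the user prefer?"], model="voyage-2",
                input_type="query").embeddings[0]
for match in index.query(vector=qvec, top_k=3, include_metadata=True).matches:
    print(match.metadata["text"])
```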
There are four programs, to be run in this order:
- services/face_service.py: Starts the face app and reacts when speech occurs.
- tau.py: The main LLM conversation loop.
- tau_speech.py: Consumes speech events and produces the actual speech. This one can run out of order, but preferably start it before the microphone listener.
- services/microphone_listener.py: Listens to your speech and emits events to tau.py as input.
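
If you want to start all four with one command, here is a hypothetical launcher sketch (not part of Tau). It assumes you run it from the repository root with the dependencies installed:

```python
# Hypothetical launcher: starts Tau's four programs in the documented order.
import subprocess
import time

SCRIPTS = [
    "services/face_service.py",
    "tau.py",
    "tau_speech.py",                    # before the microphone listener
    "services/microphone_listener.py",
]

procs = []
for script in SCRIPTS:
    procs.append(subprocess.Popen(["python", script]))
    time.sleep(2)  # assumed grace period so each service can come up

for proc in procs:
    proc.wait()
```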
This project is licensed under the MIT License.