Skip to content

dnzengou/autonomous-intelligence

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Tau - a personal friendly assistant

This is Tau!
Tau is inspired by Pi.AI and if you havent tried Pi yet, I strongly encourage you to try.
Like Pi, Tau's conversation is on continual conversation, unlike Chat based bots which feature many conversations and threads.
This is by design - Tau has a single conversation, like speaking to a human.
This is reflected by consulting Tau in decisions made along development: Order of features, voice type, etc.

Tau is a personal fun project.
I opened it as an open source for anyone to experiment with (fork), or just follow. (A star is appreciated!)
If you fork - delete history and facts to reset their knowledge and embark the journey anew!

Update status

  • System Prompt: Speech-actions speak conversation structure.
  • Conversation loop: A continueous conversation with ongoing context.
  • Immediate memory: Reduce context by summarizing it to key points. Inject memory to System prompt.
  • Long term memory: Save the running memory to vector database.
  • Speech: Voice based conversation with hearing and speaking. (Whisper and OpenAI TTS)
  • Vision infra: Set up Hailo-8L as an internal vision webservice.
    • Setup Hailo-8L on Raspberry Pi, validate examples work.
    • Look for best practices and options for integrating Hailo in your application.
    • Set up a submodule git repo with hailo-as-a-servicehailo on
  • Vision: Add Hailo-8L support for at least 1 model.
  • Vision: Add Hailo-8L support for a family of models (Object detection, face recognition, pose detection).
  • Vision - Scene detection.
  • Vision - Text extraction.
  • Long term fetching: Pull from long term memory into context.
  • Entity based memory: Add GraphRAG based memory.
  • Advanced voice: Move to ElevenLabs advanced voices.
  • Introspection: Add Introspection agent for active and background thinking and processing.
  • Growth: Add nightly finetuning, move to smaller model.

Prerequisites

Tau should be able to run on any linux with internet, but was tested only on a raspberry pi 5 8GB with official OS 64bit.

Keys

All needed keys are in .env_sample.
Copy it to .env and add your keys.
Currently, the main key is OpenAI (Chat, Speech, Whisper), and VoyageAI + Pinecone is for vectordb

I plan on moving back to Anthropic (3.5 sonnet only)

Groq was used for a fast understand action usecase

Installation

  1. Clone this repository to your Raspberry Pi:
git clone https://github.com/your-username/your-repo.git
  1. Copy .env_sample to .env and add all keys:
  • ANTHROPIC_API_KEY: used for Claude based text completion and vision. Currently unused.
  • OPENAI_API_KEY: Used for Speech, Whisper, vision and text.
  • GROQ_API_KEY: Used for a super quick action understanding, May be replaced with embeddings.
  • VOYAGE_API_KEY: VoyageAI is recommended by Anthropic. They offer the best embeddings to date (of when I selected it), and offer a great option for innovators.
  • PINECONE_API_KEY: API Key of pinecone. Serverless is a great option.
  • PINECONE_DIMENSION: Dimension of the embeddings generated by Voyage. Used for the setup of Pinecone
  • PINECONE_INDEX_NAME: Name of the index in Pinecone, for memory

Usage

Run this out of order, preferred before microphone

  • tau_speech.py: this consumes speech events, and produces actual speech

There are four programs to run by this order:

  • services/face_service.py: this starts the face app, and reacts when speech occurs
  • tau.py: this is the main LLM conversation loop
  • services/microphone_listener.py this listens to your speech and emits events to tau.py as input

License

This project is licensed under the MIT License.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 98.3%
  • Shell 1.7%