Voice AI Chatbot with Screen Sharing

Get started with the Gemini API

This project requires a Gemini API key. Use earphones or headphones while running it: with speakers, the microphone can pick up the AI's own voice and cause an audio feedback loop.

Go to Google AI Studio.
Log in with your Google account.
Create an API key.

Features

Voice Interaction: Communicate with the AI using voice commands.
Screen Sharing: The AI can view and analyze your screen in real time.
Real-Time AI Responses: Receive immediate responses from the AI based on your interactions.
Environment Configuration: Easily manage API keys and configurations using a .env file.

Benefits

Hands-Free Operation: Interact with the AI without the need for typing.
Enhanced Productivity: Manage tasks and gather information efficiently through voice commands.
Visual Assistance: The AI can assist by analyzing your screen content in real time.

Technical Details

Programming Language: Python
Voice Processing: Utilizes PyAudio for capturing and playing audio streams.
Screen Capture: Uses MSS and Pillow for capturing and processing screen images.
AI Integration: Uses Google's Generative AI (genai) library to generate responses from the Gemini API.
Environment Variables: Managed using python-dotenv.
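
As an illustration of the screen capture step, the sketch below grabs the primary monitor with MSS, converts it to a Pillow image, downsizes it, and encodes it as JPEG. It is not taken from main.py: the function names, the 1024-pixel limit, and the JPEG quality are assumptions chosen for the example.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def capture_screen_jpeg(max_side: int = 1024, quality: int = 80) -> bytes:
    """Grab the primary monitor and return it as JPEG bytes (hypothetical helper)."""
    import io

    # Third-party imports are kept local so fit_within() works without them.
    import mss
    from PIL import Image

    with mss.mss() as sct:
        # monitors[0] is the union of all monitors; monitors[1] is the first monitor.
        shot = sct.grab(sct.monitors[1])
        image = Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")

    image = image.resize(fit_within(image.width, image.height, max_side))
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    return buffer.getvalue()
```

Downsizing before encoding keeps each frame small, which matters when frames are sent repeatedly alongside audio.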

Setup Instructions

Setting Up Environment Variables

Replace your_gemini_api_key with your Gemini API key in the .env file:
```
GEMINI_API_KEY=your_gemini_api_key
```
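
A minimal sketch of how a script could read this key with python-dotenv and fail early if the placeholder was never replaced. The helper name get_api_key and the error message are assumptions for illustration; main.py may load the key differently.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:  # keep the sketch importable without the dependency
    load_dotenv = None

PLACEHOLDER = "your_gemini_api_key"


def get_api_key(env=None) -> str:
    """Return GEMINI_API_KEY, rejecting missing or placeholder values."""
    if env is None:
        if load_dotenv is not None:
            load_dotenv()  # looks for a .env file and loads it into os.environ
        env = os.environ
    key = (env.get("GEMINI_API_KEY") or "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("GEMINI_API_KEY is missing or still set to the placeholder in .env")
    return key
```

Passing a plain dictionary as env makes the check easy to exercise without touching the real environment.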

Installing Dependencies

Install the required libraries:
```
pip install -r requirements.txt
```
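
To confirm the installation worked, a short check like the one below can report which modules are still missing. The import names are assumptions based on the libraries listed under Technical Details (PyAudio, MSS, Pillow, python-dotenv); the Google genai package is left out because its import name depends on which SDK requirements.txt pins.

```python
import importlib.util

# Import names assumed for the libraries listed under Technical Details.
ASSUMED_MODULES = ["pyaudio", "mss", "PIL", "dotenv"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```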

Running the Chatbot

Run the main.py script (for example, python main.py) to start the voice chatbot.
Speak to interact with the AI. To quit the application, type "q" in the terminal.
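
One common way to support a "q" command while audio keeps streaming is a background thread that watches the terminal and sets a shared event. The sketch below shows that pattern under stated assumptions: the function names are hypothetical, and main.py may handle quitting differently.

```python
import io
import sys
import threading


def watch_for_quit(stop_event, stream=None):
    """Set stop_event when a line equal to "q" arrives, or when the stream ends."""
    stream = sys.stdin if stream is None else stream
    for line in stream:
        if line.strip() == "q":
            break
    stop_event.set()


def start_quit_watcher(stream=None):
    """Start the watcher on a daemon thread and return the event it will set."""
    stop_event = threading.Event()
    thread = threading.Thread(target=watch_for_quit, args=(stop_event, stream), daemon=True)
    thread.start()
    return stop_event


if __name__ == "__main__":
    # Simulated terminal input so the sketch runs unattended;
    # call start_quit_watcher() with no argument to read sys.stdin instead.
    stop = start_quit_watcher(io.StringIO("hello\nq\n"))
    while not stop.wait(timeout=0.1):
        pass  # the audio and screen streaming work would happen here
    print("Quit requested, shutting down.")
```

Using an event rather than exiting directly gives the audio and capture loops a chance to close their streams cleanly.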

Troubleshooting

Microphone Issues: Ensure that your microphone is properly connected and configured.
API Key Errors: Verify that the GEMINI_API_KEY is correctly set in the .env file.
Permission Errors: If you encounter access errors, run the script with appropriate permissions. Some operating systems also require you to grant the terminal access to the microphone and to screen recording.
Use Earphones: Use earphones or headphones while talking to the AI. With speakers, the microphone can pick up the AI's audio, and the AI then treats its own responses as user input, which produces a feedback loop.
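
For microphone issues, it can help to see which input devices PyAudio detects. The following is a sketch, not part of the project: the helper names are hypothetical, and it only relies on PyAudio's device query calls.

```python
def input_devices(device_infos):
    """Keep only devices that can record, as (index, name) pairs."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def list_microphones():
    """Query PyAudio for all devices and return the ones with input channels."""
    import pyaudio  # imported lazily so input_devices() works without PyAudio

    pa = pyaudio.PyAudio()
    try:
        infos = [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()
    return input_devices(infos)


if __name__ == "__main__":
    try:
        for index, name in list_microphones():
            print(f"{index}: {name}")
    except ImportError:
        print("PyAudio is not installed.")
```

If the list comes back empty, the problem is likely at the operating system or driver level rather than in the chatbot itself.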

License

This project is licensed under the MIT License.

Tabibito-AI/Talk-Screen-AI
