TikTok Sales Helper

This project was originally submitted to TikTok Techjam 2024. I have since continued its development and added more features.

The original project is here. It was built on top of WhisperLive, a nearly-live implementation of OpenAI's Whisper.

Demo

demo.mp4

Emotion Detection

emotion detection

Project Overview

This video calling web application aids sales representatives by providing real-time, accurate data about their services and leveraging AI to optimize sales strategies. It integrates live video conferencing, emotion detection, and automatic information retrieval to enhance the efficiency and effectiveness of sales interactions.

Features

  • 📹 Live Video Conferencing
  • 😊 Emotion Detection
    • Trained a Random Forest classifier from scratch on a synthesized facial dataset. Facial landmarks are extracted with MediaPipe and fed into the model. During inference, a video frame is sent to the model at a fixed interval for emotion detection.
  • 📝 Real-Time Transcription
  • 🔍 Live Automatic Information Retrieval
    • Implemented using ChromaDB; see the Implementation section for more details.
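
The emotion detection flow described above (sample a frame at a fixed interval, turn its facial landmarks into a feature vector, classify it) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `flatten_landmarks`, `sample_frames`, and `detect_emotions` are hypothetical names, and `extract_landmarks` and `classify` are placeholder callables that would presumably wrap MediaPipe and the trained Random Forest model.

```python
from typing import Any, Callable, Iterable, Iterator, List, Sequence, Tuple

# One facial landmark as (x, y, z) coordinates. Assumption: the real
# pipeline gets these from MediaPipe; here they are plain tuples.
Landmark = Tuple[float, float, float]


def flatten_landmarks(landmarks: Sequence[Landmark]) -> List[float]:
    """Turn a list of (x, y, z) landmarks into one flat feature vector."""
    return [coord for point in landmarks for coord in point]


def sample_frames(timestamps: Iterable[float], interval_s: float) -> Iterator[int]:
    """Yield the indices of frames that are at least interval_s seconds apart."""
    last_sent = None
    for index, ts in enumerate(timestamps):
        if last_sent is None or ts - last_sent >= interval_s:
            last_sent = ts
            yield index


def detect_emotions(
    frames: Sequence[Any],
    timestamps: Sequence[float],
    interval_s: float,
    extract_landmarks: Callable[[Any], Sequence[Landmark]],
    classify: Callable[[List[float]], str],
) -> List[Tuple[int, str]]:
    """Classify only the sampled frames and return (frame index, label) pairs."""
    results = []
    for index in sample_frames(timestamps, interval_s):
        features = flatten_landmarks(extract_landmarks(frames[index]))
        results.append((index, classify(features)))
    return results
```

Passing the landmark extractor and classifier in as callables keeps the sampling logic independent of any particular model, so the sketch can be exercised with simple stubs.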

Technologies

The project was created with:

Getting Started

Run the following commands to run this application on your local machine.

Clone this Repository

git clone https://github.com/martinng01/sales-helper.git

Setting Up the Environment

Set up the Anaconda environment with Python 3.11.9

conda create -n sales-helper python==3.11.9
conda activate sales-helper
pip install -r requirements.txt

Install npm dependencies

cd react-video-call
npm install

Running the Application

Run each of the 3 code blocks below in a separate terminal window, starting from the base directory:

  • Frontend
cd react-video-call
npm run dev
  • Application Server
python middleware/middleware.py
  • Transcription Server
python WhisperLive/run_server.py
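
As an optional convenience, the three commands above could be started from a single Python script instead of three terminals. This is only a sketch under assumptions: `build_commands` and `launch_all` are hypothetical helpers that are not part of the repository, and on Windows `npm` may need to be spelled `npm.cmd`.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def build_commands(base_dir: Path) -> List[Tuple[str, List[str], Path]]:
    """Return (name, command, working directory) for each of the three processes."""
    return [
        # The frontend runs from inside react-video-call, as in the manual steps.
        ("frontend", ["npm", "run", "dev"], base_dir / "react-video-call"),
        ("application server", ["python", "middleware/middleware.py"], base_dir),
        ("transcription server", ["python", "WhisperLive/run_server.py"], base_dir),
    ]


def launch_all(base_dir: Path) -> List[subprocess.Popen]:
    """Start every process without waiting for it to exit."""
    return [
        subprocess.Popen(command, cwd=cwd)
        for _name, command, cwd in build_commands(base_dir)
    ]
```

Calling `launch_all(Path("."))` from the base directory, with the `sales-helper` environment active, would start all three processes; their output is interleaved in one terminal, so separate windows remain easier for debugging.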

Open the localhost URL shown in the frontend terminal window.

Possible Improvements

  • Enable audio and video to be received from the other client. (Currently, as a proof of concept, audio and video are captured from the user's own side.)

Implementation

Architecture

architecture

Retrieval Augmented Generation (RAG) Engine

rag
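
To illustrate the retrieval step behind live automatic information retrieval, the sketch below matches a transcribed query against stored product facts and returns the closest one. It is a simplified stand-in, not the project's code and not the ChromaDB API: the `InMemoryIndex` class, the word-count `embed` function, and the sample documents are all assumptions made for illustration. In the real application, ChromaDB and a proper embedding model would fill these roles.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryIndex:
    """Hypothetical in-memory replacement for a vector database collection."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[str, Counter]] = {}

    def add(self, doc_id: str, text: str) -> None:
        self._docs[doc_id] = (text, embed(text))

    def query(self, text: str, n_results: int = 1) -> List[Tuple[str, str]]:
        """Return the n_results most similar (id, text) pairs, best first."""
        query_vec = embed(text)
        ranked = sorted(
            self._docs.items(),
            key=lambda item: cosine(query_vec, item[1][1]),
            reverse=True,
        )
        return [(doc_id, doc_text) for doc_id, (doc_text, _vec) in ranked[:n_results]]


# Usage: index two made-up product facts, then look one up from a transcript line.
index = InMemoryIndex()
index.add("pricing", "The premium plan costs 30 dollars per month")
index.add("refund", "Refunds are processed within 14 days of purchase")
top_hit = index.query("how much does the premium plan cost", n_results=1)
```

In a full RAG pipeline, the retrieved text would then be passed to a language model along with the customer's question; that generation step is left out here.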