Transcription Service App is a web app that lets universities create transcriptions from video or audio files in multiple languages. It is currently tailored towards OpenAI's Whisper models.
Some of its features are:
- Supports transcription with or without simultaneous translation into multiple languages.
- Simple interface.
- Access to two of OpenAI's Whisper models (base and large-v3).
- Supports uploading video and audio files (up to 1 GB) as well as YouTube links.
- Users can edit and download transcription results in four different formats (txt, vtt, srt and json).
- Diarization support to detect multiple speakers (up to 20).
- Srt, vtt and json formats provide timestamp and speaker information (when available).
- Transcribed subtitles can be activated in uploaded videos.
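To illustrate how timestamp and speaker information can end up in a subtitle file, here is a minimal sketch that renders transcription segments as SRT. The segment layout (dicts with `start`, `end`, `text` and an optional `speaker` key) and the function names are assumptions made for this example, not necessarily how the app stores or exports its results.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues, prefixing the speaker label when present."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = seg["text"].strip()
        speaker = seg.get("speaker")  # hypothetical key, only set when diarization ran
        if speaker:
            text = f"[{speaker}] {text}"
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    example = [
        {"start": 0.0, "end": 1.5, "text": " Hello everyone. ", "speaker": "SPEAKER_00"},
        {"start": 1.5, "end": 3.0, "text": "Welcome to the lecture."},
    ]
    print(segments_to_srt(example))
```

A VTT export would differ mainly in the `WEBVTT` header line and in using a period instead of a comma before the milliseconds.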
Before using this app, you need to set up a WhisperX API server for it to connect to.
Some environment variables need to be set. Here is an example .env file:
# Path to the ffmpeg executable on your system
FFMPEG_PATH=/usr/bin/ffmpeg
# Path where temporary files will be generated
TEMP_PATH=transcription-whisper-temp
# Uncomment this if you're using an authentication process, to allow users to log out
#LOGOUT_URL=/oauth2/sign_out
# URL and port of the API server
API_URL=http://111.111.111.11:11300
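As a rough sketch of how these variables could be read in Python, the snippet below collects them into a small settings object using only the standard library. The `Settings` class, the `load_settings` function, and the fallback values are hypothetical and chosen for illustration. The app's own loading code may differ, and it is assumed here that the variables are already present in the process environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    ffmpeg_path: str
    temp_path: str
    api_url: str
    logout_url: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables from the example .env file out of the environment."""
    api_url = env.get("API_URL")
    if not api_url:
        # Treating API_URL as mandatory is an assumption of this sketch.
        raise RuntimeError("API_URL must point to the API server")
    return Settings(
        # Fallback values below are illustrative, not the app's defaults.
        ffmpeg_path=env.get("FFMPEG_PATH", "ffmpeg"),
        temp_path=env.get("TEMP_PATH", "transcription-whisper-temp"),
        api_url=api_url.rstrip("/"),
        # LOGOUT_URL is optional; a missing or empty value means no logout link.
        logout_url=env.get("LOGOUT_URL") or None,
    )


if __name__ == "__main__":
    print(load_settings({"API_URL": "http://localhost:11300/"}))
```

Passing the environment as a mapping makes the function easy to exercise with a plain dict, without touching the real process environment.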
The app is built with the Streamlit framework.
You can install the requirements needed to run and develop the app using pip install -r requirements.txt.
Then start a development server like this:
streamlit run app.py
virtUOS