An offline Python application that transcribes and translates audio or video files into text and generates a subtitle file (.srt). It uses deep learning libraries such as openai-whisper and argos-translate, and includes the following functionality:
- Audio to text transcriptions using the OpenAI Whisper library.
- Language selection for text translation using the Argos translate library.
- Video to audio converter using the MoviePy library.
- Option selection through a command-line interface.
- Exception handling.
- Enums.
- File Storage.
- Seeders in JSON format.
- Environment variables.
- Python 3.12
- The project contains the files to deploy it in Docker.
You can add more translation languages by modifying the Dockerfile and the languages_available.json file.
It also includes the option to keep the same language as the source, without translating.
To | From |
---|---|
English | Spanish |
Spanish | English |
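As a minimal sketch of how the language selection could map onto Argos Translate, the function below translates a list of subtitle lines and returns them unchanged when the source and target codes match (the "same language" option). The name `translate_lines` and the `translate_fn` parameter are hypothetical and are not taken from the project; the only library call used is `argostranslate.translate.translate(text, from_code, to_code)`, which assumes the matching language package (for example translate-es_en) is already installed.

```python
from typing import Callable, Optional


def translate_lines(
    lines: list[str],
    from_code: str,
    to_code: str,
    translate_fn: Optional[Callable[[str, str, str], str]] = None,
) -> list[str]:
    """Translate each line, or return the lines unchanged when both codes match."""
    if from_code == to_code:
        # "Same language" option: nothing to translate.
        return list(lines)
    if translate_fn is None:
        # Imported lazily so the same-language path works without Argos Translate.
        from argostranslate.translate import translate as translate_fn
    return [translate_fn(line, from_code, to_code) for line in lines]
```

Passing a custom `translate_fn` is only a convenience for testing the flow without loading a translation model.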
The supported formats for both audio and video files can be modified in the file_type.py.
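For illustration only, an enum that gates files by extension might look like the sketch below. The class name, member names and extensions are assumptions made for this example and do not reproduce the actual contents of file_type.py.

```python
from enum import Enum
from pathlib import Path


class FileType(Enum):
    # Illustrative extensions only; the real lists live in file_type.py.
    AUDIO = (".mp3", ".wav")
    VIDEO = (".mp4", ".mkv")

    def accepts(self, filename: str) -> bool:
        """Return True when the file extension belongs to this type (case-insensitive)."""
        return Path(filename).suffix.lower() in self.value
```

With this shape, supporting a new format would mean adding its extension to the relevant tuple.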
Name | Path | Description | Supported formats |
---|---|---|---|
Audios | data/audios | Directory for the audio files that you can later select to generate subtitles. | See file_type.py |
Videos | data/videos | Directory for the video files that you can later select, convert into an audio file and use to generate subtitles. | See file_type.py |
Subtitles | data/subtitles | Once the subtitles have been generated, the result is saved in this directory. | .srt |
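The .srt output can be sketched as follows, assuming the transcription step yields segments shaped like the `segments` list returned by Whisper's `transcribe` (dictionaries with `start` and `end` in seconds and a `text` string). The helper names `srt_timestamp` and `segments_to_srt` are hypothetical and may differ from what the project does internally.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text: a numbered cue, a time range and the text, separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

The returned string could then be written to a file under data/subtitles with UTF-8 encoding.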
- Use an NVIDIA graphics card that supports CUDA to run the application, as it significantly reduces transcription time compared to running on the CPU.
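A minimal sketch of choosing between GPU and CPU at startup is shown below. It assumes PyTorch is the backend (openai-whisper depends on it) and falls back to the CPU when PyTorch or CUDA is unavailable; the function name `pick_device` is hypothetical.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-capable GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU path to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Possible usage with Whisper (not executed here):
#   import whisper
#   model = whisper.load_model("medium", device=pick_device())
```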
$ apt-get install ffmpeg
$ git clone https://github.com/JAVI-CC/python-openai-generator-srt
$ cd python-openai-generator-srt
$ cp .env.example .env # optional
$ pip install --no-cache-dir --upgrade -r requirements.txt
# Install translation languages
$ argospm install translate-en_es
$ argospm install translate-es_en
$ python app/main.py
# [tiny, base, small, medium, large, turbo]
WHISPER_LOAD_MODEL="medium"
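One way to read and validate this setting is sketched below. It assumes the value from .env has already been exported into the process environment (for example by Docker Compose or a dotenv loader), and that "medium" is a reasonable fallback; the function name `whisper_model_name` is hypothetical.

```python
import os

# Model names listed in the comment above the variable.
ALLOWED_MODELS = ("tiny", "base", "small", "medium", "large", "turbo")


def whisper_model_name(default: str = "medium") -> str:
    """Read WHISPER_LOAD_MODEL and reject values outside the known list."""
    name = os.environ.get("WHISPER_LOAD_MODEL", default).strip().lower()
    if name not in ALLOWED_MODELS:
        raise ValueError(
            f"WHISPER_LOAD_MODEL must be one of {ALLOWED_MODELS}, got {name!r}"
        )
    return name
```

Failing early on an unknown name avoids discovering the typo only when the model download starts.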
Docker repository: https://hub.docker.com/r/javi98/python-openai-generator-srt
- Docker installed on your machine.
- A machine with an NVIDIA GPU that supports CUDA.
- The NVIDIA Container Toolkit installed, so that the GPU can be used inside a Docker container.
- nvidia/cuda:12.5.1-base-ubuntu20.04
├── python-openai-generator-srt-app
$ git clone https://github.com/JAVI-CC/python-openai-generator-srt
$ cd python-openai-generator-srt
$ cp .env.example .env # optional
$ docker compose up -d
$ docker compose exec app python3 /code/app/app/main.py
Once the containers are deployed, you will be shown the CLI, where you choose the file and the language for the generated .srt file.