Python OpenAI Generator Srt

OpenAI_Whisper


An offline application written in Python that transcribes and translates audio or video files into text and generates a subtitle file (.srt). It uses deep learning libraries such as openai-whisper and argos-translate, and provides the following functionality:

  • Audio to text transcriptions using the OpenAI Whisper library.
  • Language selection for text translation using the Argos translate library.
  • Video to audio converter using the MoviePy library.
  • Option selection through a command-line interface.
  • Exception handling.
  • Enums.
  • File storage.
  • Seeders are in JSON format.
  • Environment variables.
  • Python 3.12
  • The project contains the files to deploy it in Docker.
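The last step of the pipeline described above, turning timed transcription segments into .srt text, can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `srt_timestamp` and `segments_to_srt` are hypothetical, and the input is assumed to be a list of dictionaries with `start`, `end` (in seconds) and `text` keys, which is the shape of the `segments` entry returned by openai-whisper's `transcribe`.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments shaped like {"start", "end", "text"}."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each cue: sequence number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)
```

With openai-whisper installed, the segments could come from something like `whisper.load_model("medium").transcribe(path)["segments"]`, where `path` points to a file in data/audios.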

Screenshots CLI

screen1



screen2



screen3



screen4

Languages available by default

You can add more translation languages by modifying the Dockerfile and the languages_available.json file.

There is also an option to keep the same language, in which case the text is not translated.

To | From
English | Spanish
Spanish | English
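A rough sketch of how translation between these language pairs, including the same-language option, might look. The helper `translate_text` and its `translator` parameter are hypothetical and not taken from the project; the only library call assumed is `argostranslate.translate.translate(text, from_code, to_code)`, which requires the matching language package to be installed.

```python
def translate_text(text, from_code, to_code, translator=None):
    """Translate text, or return it unchanged when both languages match."""
    if from_code == to_code:
        # Same-language option: nothing to translate.
        return text
    if translator is None:
        # Imported lazily so the same-language path works without the package.
        import argostranslate.translate
        translator = argostranslate.translate.translate
    return translator(text, from_code, to_code)
```

Passing a custom `translator` callable makes the function easy to exercise without the language packages installed, for example `translate_text("Hello", "en", "es", translator=my_fake)`.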

File storage

The supported formats for both audio and video files can be modified in file_type.py.

Name | Path | Description | Supported formats
Audios | data/audios | Directory for audio files that you can later select to generate subtitles. | .mp3, .ogg
Videos | data/videos | Directory for video files that you can select, convert into an audio file and generate subtitles from. | .mp4, .mov, .wmv, .avi, .avchd, .flv, .mkv, .webm, .html5, .mpeg-2
Subtitles | data/subtitles | Directory where the generated subtitles are saved in .srt format. | .srt
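As an illustration of how supported formats and storage directories can fit together, the sketch below lists the selectable files in a directory by extension. The enum `AudioFormat` and the function `list_supported` are hypothetical; the real definitions live in file_type.py and may be structured differently.

```python
from enum import Enum
from pathlib import Path


class AudioFormat(Enum):
    """Hypothetical enum mirroring the audio formats listed above."""
    MP3 = ".mp3"
    OGG = ".ogg"


def list_supported(directory, formats):
    """Return the files in `directory` whose extension is in `formats`, sorted."""
    allowed = {fmt.value for fmt in formats}
    return sorted(
        path
        for path in Path(directory).iterdir()
        # Compare in lowercase so "clip.MP3" is accepted as well.
        if path.is_file() and path.suffix.lower() in allowed
    )
```

For example, `list_supported("data/audios", AudioFormat)` would return the .mp3 and .ogg files available for selection.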

Recommended requirements

  • Use an NVIDIA graphics card that supports CUDA to run the application; it significantly reduces transcription time compared to running on the CPU.
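One possible way to prefer the GPU and fall back to the CPU is sketched below. The helper `pick_device` is hypothetical and not taken from the project; it relies only on PyTorch's `torch.cuda.is_available()`, and treats a missing PyTorch installation as CPU-only.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-capable GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU support to detect.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

The result can then be passed to openai-whisper, for instance `whisper.load_model("medium", device=pick_device())`.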

Setup

$ apt-get install ffmpeg
$ git clone https://github.com/JAVI-CC/python-openai-generator-srt
$ cd python-openai-generator-srt
$ cp .env.example .env # optional
$ pip install --no-cache-dir --upgrade -r requirements.txt
# Install translation languages
$ argospm install translate-en_es
$ argospm install translate-es_en
$ python app/main.py

Configure values in the .env file (Optional)

# [tiny, base, small, medium, large, turbo]
WHISPER_LOAD_MODEL="medium"
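A minimal sketch of reading and validating this setting, assuming the variable has already been exported into the process environment (for example by Docker Compose or a dotenv loader). The function `whisper_model_name` and the fallback to "medium" are assumptions for illustration, not the project's confirmed behaviour.

```python
import os

# Model sizes listed in the .env comment above.
ALLOWED_MODELS = ("tiny", "base", "small", "medium", "large", "turbo")


def whisper_model_name(default: str = "medium") -> str:
    """Read WHISPER_LOAD_MODEL from the environment and validate it."""
    raw = os.environ.get("WHISPER_LOAD_MODEL", default)
    # Tolerate surrounding whitespace or quotes copied from the .env file.
    name = raw.strip().strip('"').lower()
    if name not in ALLOWED_MODELS:
        raise ValueError(
            f"WHISPER_LOAD_MODEL must be one of {ALLOWED_MODELS}, got {raw!r}"
        )
    return name
```

Failing early on an unknown value avoids discovering the mistake only when the model download starts.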


Deploy to Docker 🐳

Docker repository: https://hub.docker.com/r/javi98/python-openai-generator-srt

Requirements

  • Docker installed on your machine.
  • A machine with an NVIDIA GPU that supports CUDA.
  • The NVIDIA Container Toolkit installed, so that the GPU can be used inside a Docker container.

Containers:

  • nvidia/cuda:12.5.1-base-ubuntu20.04

Containers structure:

├── python-openai-generator-srt-app

Setup:

$ git clone https://github.com/JAVI-CC/python-openai-generator-srt
$ cd python-openai-generator-srt
$ cp .env.example .env # optional
$ docker compose up -d
$ docker compose exec app python3 /code/app/app/main.py

Once the containers are deployed, the CLI is shown so you can choose the file and the language in which to generate the .srt file.