This tool provides a quick, concise summary of audio and video files. It can summarize content from a local file or directly from YouTube. The tool uses Whisper for transcription and a Mistral model run locally through Ollama for generating summaries.
Tip
You can change the model used for summarization. To do this, change the OLLAMA_MODEL variable and download the associated model via ollama.
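As a minimal sketch of how such a setting might be wired up, assume OLLAMA_MODEL is a module-level variable in the script. The environment-variable override, the `DEFAULT_MODEL` constant and the `resolve_model` helper are hypothetical additions for illustration and are not taken from the tool itself.

```python
import os

# Assumed default, matching the model pulled in the setup steps.
DEFAULT_MODEL = "mistral"


def resolve_model(environ=None):
    """Return the model name to use, preferring an OLLAMA_MODEL override.

    Reading the value from the environment is an assumption made for this
    sketch; the tool may simply define the variable in its source code.
    """
    environ = os.environ if environ is None else environ
    name = environ.get("OLLAMA_MODEL", "").strip()
    # Fall back to the default when the variable is unset or blank.
    return name or DEFAULT_MODEL


OLLAMA_MODEL = resolve_model()
```

Whichever name is chosen, the model has to be downloaded first with `ollama pull`, in the same way as the Mistral model.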
- YouTube Integration: Download and summarize content directly from YouTube.
- Local File Support: Summarize audio files available on your local disk.
- Transcription: Converts audio content to text using Whisper.
- Summarization: Generates a concise summary using Mistral AI (Ollama).
- Transcript Only Option: Transcribe the audio content without generating a summary.
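The transcription and summarization features listed above could be sketched roughly as follows. This is an illustration, not the tool's actual code: the function names, the `base` model size, the prompt wording and the `max_chars` limit are all assumptions. Only `whisper.load_model` and `model.transcribe` come from the openai-whisper package.

```python
def transcribe(audio_path, model_size="base"):
    """Transcribe an audio file to text with openai-whisper.

    The "base" model size is an assumption; larger models are slower
    but usually more accurate.
    """
    import whisper  # imported lazily because loading the library is slow

    model = whisper.load_model(model_size)
    result = model.transcribe(str(audio_path))
    return result["text"].strip()


def build_summary_prompt(transcript, max_chars=12000):
    """Build the prompt sent to the LLM (hypothetical wording).

    The transcript is clipped to max_chars so very long recordings do not
    exceed the model's context window; the limit is an arbitrary choice.
    """
    clipped = transcript[:max_chars]
    return (
        "Summarize the following transcript in a few concise bullet points.\n\n"
        f"Transcript:\n{clipped}"
    )
```

With `--transcript-only`, a pipeline like this would stop after `transcribe` and skip the prompt entirely.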
Before you start using this tool, you need to install the following dependencies:
- Python 3.8 or higher
- pytube for downloading videos from YouTube.
- pathlib for local file handling.
- openai-whisper for audio transcription.
- Ollama for LLM model management.
- ffmpeg (required for whisper)
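Before running the tool, it can help to confirm that the dependencies above are in place. The following is a small, hypothetical checker that is not part of the repository. It assumes the packages are importable as `pytube` and `whisper`, and that `ffmpeg` and `ollama` are available as commands on the PATH.

```python
import importlib.util
import shutil
import sys

# Import names assumed for the packages listed above:
# pytube -> "pytube", openai-whisper -> "whisper".
REQUIRED_MODULES = ("pytube", "whisper")
REQUIRED_COMMANDS = ("ffmpeg", "ollama")


def missing_dependencies(modules=REQUIRED_MODULES, commands=REQUIRED_COMMANDS):
    """Return a list of human-readable problems; empty if everything is found."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or higher is required")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"Python package not importable: {name}")
    for name in commands:
        # shutil.which returns None when the executable is not on the PATH.
        if shutil.which(name) is None:
            problems.append(f"command not found on PATH: {name}")
    return problems
```

Calling `missing_dependencies()` and printing each entry gives a quick list of what still needs to be installed.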
Clone the repository and install the required Python packages:
git clone https://github.com/damienarnodo/audio-summary-with-local-LLM.git
cd audio-summary-with-local-LLM
pip install -r src/requirements.txt
Download and install Ollama, which manages the local LLM. Details about the supported models can be found on the Ollama GitHub.
Download and use the Mistral model:
ollama pull mistral
Test the access:
ollama run mistral "tell me a joke"
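The same access check can be done from Python through Ollama's local HTTP API. This sketch assumes the Ollama server is running on its default address, `http://localhost:11434`, and uses only the standard library. The helper names `build_request` and `generate` are made up for this example, and the tool itself may talk to Ollama differently.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral", url=OLLAMA_URL):
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="mistral", timeout=120):
    """Send the prompt to Ollama and return the generated text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False the full answer arrives in the "response" field.
    return body["response"]
```

With the server running and the model pulled, `print(generate("tell me a joke"))` should behave like the command above.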
The tool can be executed with the following command line options:
- --from-youtube: To download and summarize a video from YouTube.
- --from-local: To load and summarize an audio or video file from the local disk.
- --transcript-only: To only transcribe the audio content without generating a summary. This option must be used with either --from-youtube or --from-local.
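The options above map naturally onto `argparse`. The sketch below shows one way to declare them; it is not the actual contents of `src/summary.py`. Making the two sources a required, mutually exclusive group is an assumption, chosen because it enforces the rule that `--transcript-only` needs a source.

```python
import argparse


def build_parser():
    """Create a parser for the three documented command line options."""
    parser = argparse.ArgumentParser(
        description="Summarize audio or video content."
    )
    # Exactly one source must be given (assumed behavior).
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument(
        "--from-youtube", metavar="URL",
        help="download and summarize a video from YouTube",
    )
    source.add_argument(
        "--from-local", metavar="PATH",
        help="load and summarize an audio or video file from the local disk",
    )
    parser.add_argument(
        "--transcript-only", action="store_true",
        help="only transcribe the audio content, without a summary",
    )
    return parser
```

For example, `build_parser().parse_args(["--from-local", "talk.mp3", "--transcript-only"])` yields `from_local="talk.mp3"` and `transcript_only=True`, while passing `--transcript-only` alone exits with a usage error.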
- Summarizing a YouTube video:
python src/summary.py --from-youtube <YouTube-Video-URL>
- Summarizing a local audio file:
python src/summary.py --from-local <path-to-audio-file>
- Transcribing a YouTube video without summarizing:
python src/summary.py --from-youtube <YouTube-Video-URL> --transcript-only
- Transcribing a local audio file without summarizing:
python src/summary.py --from-local <path-to-audio-file> --transcript-only
The output summary will be saved in a markdown file in the specified output directory, while the transcript will be saved in the temporary directory.
The summarized content is saved as a markdown file named summary.md in the current working directory. This file includes the transcribed text and its corresponding summary. If --transcript-only is used, only the transcription will be saved in the temporary directory.
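The output behavior described above could look roughly like the sketch below. It is an illustration under stated assumptions: the `save_outputs` helper, the `transcript.txt` file name, the markdown layout, and the use of the system temporary directory are all hypothetical. Only the `summary.md` name and its location in the current working directory come from the description.

```python
import tempfile
from pathlib import Path


def save_outputs(transcript, summary=None, out_dir=None, tmp_dir=None):
    """Write the transcript, and the summary if there is one, to disk.

    Returns (transcript_path, summary_path); summary_path is None when no
    summary was produced, as with --transcript-only.
    """
    # Assumption: "the temporary directory" is the system temp directory.
    tmp_dir = Path(tmp_dir or tempfile.gettempdir())
    transcript_path = tmp_dir / "transcript.txt"  # hypothetical file name
    transcript_path.write_text(transcript, encoding="utf-8")

    if summary is None:
        return transcript_path, None

    # summary.md holds both the summary and the transcribed text.
    summary_path = Path(out_dir or Path.cwd()) / "summary.md"
    summary_path.write_text(
        f"# Summary\n\n{summary}\n\n# Transcript\n\n{transcript}\n",
        encoding="utf-8",
    )
    return transcript_path, summary_path
```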