Customizable MKV audio and subtitle track selection with dry-run and full-run modes, comprehensive logging, and the ability to skip previously processed files on subsequent runs.
Are you tired of switching audio or subtitle tracks every time you start a video because your preferred tracks aren’t set as default? If you have a large media collection with incorrectly flagged audio and subtitle tracks, this script is for you. It automates the process of setting your preferred tracks as default while ensuring other tracks are explicitly marked as not default, saving you time and effort.
- Audio Track Selection:
- Automatically selects the audio track that matches your preferred languages (ranked list).
- Marks the selected audio track as
default
. - Ensures all other audio tracks are explicitly marked as
not default
.
- Subtitle Track Selection:
- Automatically selects the best subtitle track based on your preferences:
- A ranked list of subtitle languages.
- Tracks with specific keywords (e.g., "dialogue") are prioritized.
- Tracks with unwanted keywords (e.g., "sign", "song") are excluded.
- Marks the selected subtitle track as
default
. - Ensures all other subtitle tracks are explicitly marked as
not default
.
- Automatically selects the best subtitle track based on your preferences:
- Dry-Run Mode:
- Simulates changes without modifying files.
- Logs detailed information about what changes would be made.
- Batch Processing:
- Processes all
.mkv
files in a given directory (recursively). - Skips files that have already been processed (tracked in a log file).
- Processes all
- Comprehensive Logging:
- Logs all actions, including processed files, skipped files, and errors.
- Error Handling:
- Cleans up temporary files in case of unexpected errors.
I’m not a professional coder, and this script was written entirely using ChatGPT. I wanted to share it because it worked really well for me, and it might help others too — or at the very least serve as a base for your own modifications. Please note that I’ve successfully used it on over 3,500 anime episodes without issues, but I can’t guarantee flawless results for all use cases. Before running this script on your entire library, test it on a small set of files with varying configurations. I am not responsible for any issues, such as data loss or unintended changes. Always back up your media beforehand — this is a best practice for any automated batch processing. I’ll maybe consider handling issues or feature requests if I have the time or ChatGPT works well with me.
Feel free to fork, tweak, and improve this script as you see fit. Enjoy!
This script depends on Python 3.6+ and mkvmerge
(from MKVToolNix). Follow the instructions for your operating system below.
- Install Python:
- Download and install Python 3.x from python.org.
- During installation, check the box that says “Add Python to PATH”.
- Install MKVToolNix:
- Download the MKVToolNix installer from the official MKVToolNix site.
- Install and add the
mkvmerge
executable to your system PATH.
- Install
tqdm
(optional dependency for progress bars):-
Open Command Prompt and run:
pip install tqdm
-
-
Install Homebrew (if not already installed):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
-
Install Python and MKVToolNix:
brew install python mkvtoolnix
-
Install
tqdm
:pip3 install tqdm
-
Install Python:
sudo apt update sudo apt install python3 python3-pip
-
Install MKVToolNix:
sudo apt install mkvtoolnix
-
Install
tqdm
:pip3 install tqdm
The script is customizable through several configuration options. Below is the complete breakdown:
DRY_RUN = True # Set to True for dry-run mode (no changes made); False for full-run
INPUT_DIRECTORY = "/path/to/input/directory" # Directory containing MKV files
LOG_DIRECTORY = "/path/to/log/directory" # Directory for log files
DRY_RUN
: Set toTrue
for testing without modifying files. Set toFalse
for full execution.INPUT_DIRECTORY
: The folder containing the.mkv
files you want to process.LOG_DIRECTORY
: Where logs (activity and processed files) will be stored.
AUDIO_PREFERRED_LANGUAGES = ["jpn", "kor", "tgl"] # Ranked list of preferred audio languages
- Add your preferred audio languages in order of priority (use language codes like "eng", "jpn", etc.).
- Example: To prioritize Japanese audio, followed by Korean, then Tagalog, use the above setting.
SUBTITLE_PREFERRED_LANGUAGES = ["eng", "und", "rus"] # Ranked list of subtitle languages
EXCLUDED_SUBTITLE_KEYWORDS = ["sign", "song"] # Exclude tracks with these keywords
PREFERRED_SUBTITLE_KEYWORDS = ["dialogue", "dialog"] # Prefer tracks with these keywords
SUBTITLE_PREFERRED_LANGUAGES
: Ranked list of your preferred subtitle languages.EXCLUDED_SUBTITLE_KEYWORDS
: Tracks with these keywords in their titles will be ignored. Use[]
if you have no exclusions.PREFERRED_SUBTITLE_KEYWORDS
: Keywords that will prioritize a subtitle track. Use[]
if you have no preferences.
For configuring the audio and subtitle languages, refer to the ISO 639-2 language codes. Some common examples include:
- "jpn": Japanese
- "eng": English
- "spa": Spanish
-
Open a terminal/command prompt.
-
Navigate to the directory containing the script:
cd /path/to/script
-
Run the script:
python3 mkv_track_optimizer.py
To preview the changes without modifying the files:
DRY_RUN = True
Output:
Dry-run: Audio Track 1 would be set as default.
Dry-run: Audio Track 2 would be unset as default.
Dry-run: Subtitle Track 3 would be set as default.
Dry-run: Subtitle Track 4 contains excluded keywords and would not be default.
When you're ready to apply the changes:
DRY_RUN = False
Output:
Processing: MyAnimeEpisode.mkv
Adjusting tracks...
Audio Track 1 set as default.
Audio Track 2 marked as not default.
Subtitle Track 3 set as default.
Subtitle Track 4 marked as not default.
Done:
If the script encounters a file that has already been processed, the output would look like this:
-------------------------------
Processing File: /path/to/your/media/Anime/Season 01/Episode01.mkv
-------------------------------
File has already been processed. Skipping...
This message appears under the file header and clearly indicates that the script is skipping the file because it is listed in the processed log. This ensures efficiency by preventing redundant work.
The script has worked really well for me, and I hope it works just as well for you. However, I completely understand if some adjustments might be needed for your specific use case. Feel free to fork the repository, tweak the code, and make any improvements that suit your needs.
If you encounter any issues or have feature requests, let me know, and I might get around to working on them -- time and ChatGPT cooperation permitting. 🫠
Hopefully, this script saves you some time and frustration with your media organization. Enjoy!