Vietnamese Speech Synthesis with the VITS Text-to-Speech Model and the Coqui TTS Framework

This repository is dedicated to customizing and training VITS (Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech) for text-to-speech (TTS) on Vietnamese-language data, using the Coqui TTS framework. It contains the code and resources needed to train VITS to generate high-quality speech from Vietnamese text.

Prerequisites

  1. I highly recommend using a conda virtual environment with Python 3.11.5.
conda create -n vits python=3.11.5
conda activate vits
  2. In this repo, I use the TTS framework version 0.17.5 for stability.
pip install TTS==0.17.5
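
Training

With the environment in place, training follows the standard Coqui TTS VITS recipe. The sketch below is illustrative only, not the exact script used in this repo: the dataset path, metadata file, and phoneme settings are assumptions for a single-speaker Vietnamese corpus in LJSpeech format.

import os

from trainer import Trainer, TrainerArgs

from TTS.tts.configs.shared_configs import BaseDatasetConfig
from TTS.tts.configs.vits_config import VitsConfig
from TTS.tts.datasets import load_tts_samples
from TTS.tts.models.vits import Vits, VitsAudioConfig
from TTS.tts.utils.text.tokenizer import TTSTokenizer
from TTS.utils.audio import AudioProcessor

output_path = os.path.dirname(os.path.abspath(__file__))

# Hypothetical single-speaker Vietnamese dataset in LJSpeech format
# (a wavs/ folder plus a metadata.csv of "file|transcript" lines).
dataset_config = BaseDatasetConfig(
    formatter="ljspeech", meta_file_train="metadata.csv", path="data/vi_corpus"
)
audio_config = VitsAudioConfig(
    sample_rate=22050, win_length=1024, hop_length=256, num_mels=80, mel_fmin=0, mel_fmax=None
)
config = VitsConfig(
    audio=audio_config,
    run_name="vits_vietnamese",
    batch_size=16,
    eval_batch_size=8,
    num_loader_workers=4,
    num_eval_loader_workers=2,
    run_eval=True,
    epochs=1000,
    # Assumes espeak-ng with Vietnamese support; set use_phonemes=False and a
    # character-based config instead if you use a different text frontend.
    use_phonemes=True,
    phoneme_language="vi",
    phoneme_cache_path=os.path.join(output_path, "phoneme_cache"),
    text_cleaner="basic_cleaners",
    mixed_precision=True,
    output_path=output_path,
    datasets=[dataset_config],
)

# Audio processor, tokenizer, and train/eval samples from the dataset config
ap = AudioProcessor.init_from_config(config)
tokenizer, config = TTSTokenizer.init_from_config(config)
train_samples, eval_samples = load_tts_samples(dataset_config, eval_split=True)

# Initialize VITS and launch training; checkpoints land under output_path
model = Vits(config, ap, tokenizer, speaker_manager=None)
trainer = Trainer(
    TrainerArgs(), config, output_path,
    model=model, train_samples=train_samples, eval_samples=eval_samples,
)
trainer.fit()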

Inference

from TTS.api import TTS

# Load a trained VITS checkpoint and its config from local files
tts = TTS(model_path='path to the .pth file',
          config_path='path to the config.json file')

tts.tts_to_file(text="Your example text", file_path="your_filename.wav")
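
If you need the raw waveform instead of a file (for example, for further post-processing), the same object also exposes tts(). This is a minimal sketch; tts.synthesizer.output_sample_rate is how the bundled synthesizer reports the output sample rate in TTS 0.17.x.

import soundfile as sf

# Returns the synthesized waveform as a list of float samples
wav = tts.tts(text="Your example text")
sf.write("your_filename.wav", wav, samplerate=tts.synthesizer.output_sample_rate)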

Demo

My trained model is published on this Hugging Face Space. Because of limited training resources, the results are not yet as good as expected. The next goal is to collect my own voice data and implement voice cloning.
