Feel free to join my Discord Server to discuss this model!
An independent VALL-E 2 reproduction for voice synthesis with voice cloning.
Demo video: `supervoice_valle.mp4`
- ⚡️ Natural sounding speech with human-level voice cloning
- 🎤 High quality - 24 kHz audio
- 🤹‍♂️ Versatile - synthesized voice has high variability
- 📕 Currently only English is supported, but nothing stops us from adding more languages
- The network can follow voices, but they work best in-domain: voices from LibriLight, LibriTTS, and similar sources
This reproduction follows the papers as closely as possible, with some minor changes:
- Linear annealing replaced with cosine annealing
- Codec grouping is not implemented
- No padding masking is used during training, since it would make training about 5× slower with flash attention
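The cosine annealing mentioned above can be sketched as follows (a minimal illustration of the schedule shape only, not this repo's training code; the function and parameter names are hypothetical):

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    # Smoothly decay from base_lr to min_lr over total_steps,
    # instead of decreasing linearly.
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

Compared to a linear schedule, the cosine curve decays slowly at the start and end of training and faster in the middle.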
```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load model
model = torch.hub.load(repo_or_dir='ex3ndr/supervoice-vall-e-2', model='supervoice')
model = model.to(device)

# Synthesize
in_voice_1 = model.synthesize("voice_1", "What time is it, Steve?", top_p = 0.2).cpu()
in_voice_2 = model.synthesize("voice_2", "What time is it, Steve?", top_p = 0.2).cpu()

# Experimental voices
in_emo_1 = model.synthesize("emo_1", "What time is it, Steve?", top_p = 0.2).cpu()
in_emo_2 = model.synthesize("emo_2", "What time is it, Steve?", top_p = 0.2).cpu()
```
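The `synthesize` calls above return a CPU tensor; assuming it holds a mono float waveform at 24 kHz (an assumption to verify against the model's output shape), it can be saved with `torchaudio.save`, or with only the standard library as sketched here:

```python
import struct
import wave

def save_wav(samples, path: str, sample_rate: int = 24000) -> None:
    # Convert float samples in [-1, 1] to 16-bit PCM and write a mono WAV file.
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)        # mono
        f.setsampwidth(2)        # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(pcm)
```

For a tensor like `in_voice_1`, pass a flat list of samples, e.g. `save_wav(in_voice_1.flatten().tolist(), "voice_1.wav")`.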
License: MIT