Training results for Russian 22050Hz Natasha #32

shigabeev · 2023-08-29T21:28:59Z

shigabeev
Aug 29, 2023

The model was trained for only 58,000 steps, without SDP (the stochastic duration predictor) and without the duration discriminator. The number of text encoder layers was increased from 6 to 10.
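For anyone trying to reproduce a run like this, the changes described above could be expressed as overrides on a baseline model config. This is only a minimal sketch: the key names `n_layers`, `use_sdp`, and `use_duration_discriminator` are assumptions modeled on VITS-style JSON configs, and the baseline values are illustrative, not copied from the repo.

```python
import copy
import json

# Hypothetical baseline: key names and values are assumptions for illustration,
# not taken from the actual config files in the repository.
BASE_MODEL_CFG = {
    "n_layers": 6,                       # text encoder layers
    "use_sdp": True,                     # stochastic duration predictor
    "use_duration_discriminator": True,  # duration (DP) discriminator
}


def make_run_config(base, **overrides):
    """Return a copy of `base` with `overrides` applied.

    Unknown keys raise an error so that a typo in a flag name is caught
    instead of being silently ignored.
    """
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    cfg = copy.deepcopy(base)
    cfg.update(overrides)
    return cfg


# The first run as described in the post: 10 encoder layers, no SDP,
# no duration discriminator.
first_run = make_run_config(
    BASE_MODEL_CFG,
    n_layers=10,
    use_sdp=False,
    use_duration_discriminator=False,
)
print(json.dumps(first_run, indent=2))
```

Rejecting unknown keys is a deliberate choice here: a misspelled flag is one way a setting can end up having no effect on training.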

The model, a sound sample, and symbols.py for Russian can be found on Google Drive.

https://drive.google.com/drive/folders/1v-jGF8k_gfIUHHFafA1qGOtzDm3YlVbs?usp=sharing

beqabeqa473 · 2023-08-30T05:48:31Z

beqabeqa473
Aug 30, 2023

Cool. It seems it needs more steps to fix the sound. It looks like the vocoder is not trained enough at this point.

0 replies

shigabeev · 2023-08-30T10:35:51Z

shigabeev
Aug 30, 2023
Author

This time I trained it with SDP for 138K epochs. Now it sounds far better.

https://drive.google.com/drive/folders/1y8cDIp0MmSP2LS6V8jZ7fjKLKfw33kas?usp=sharing

3 replies

p0p4k Aug 30, 2023
Maintainer

Also with dp-discriminator?

shigabeev Sep 1, 2023
Author

It turns out that your setup doesn't turn off the DP discriminator no matter what's written in the config, so luckily it has always been turned on. It works!

p0p4k Sep 1, 2023
Maintainer

Oops, I'll look into that today. Thanks for letting me know.

beqabeqa473 · 2023-08-30T18:49:47Z

beqabeqa473
Aug 30, 2023

Also, is it still using the HiFi-GAN vocoder, or did you use the work from your branch?


7 replies

shigabeev Sep 1, 2023
Author

Yes, there is a fork of your repo with MS-iSTFT. It sounds majestic. At least, on my data.

https://github.com/FENRlR/MB-iSTFT-VITS2

p0p4k Sep 1, 2023
Maintainer

Should we mix this repo with that, so users get to choose more vocoder options?

p0p4k Sep 1, 2023
Maintainer

@shigabeev Also, a few samples to compare all three would be great, if you can manage that! Thanks, and I appreciate your efforts!

shigabeev Sep 1, 2023
Author

I halted the BigVGAN training at around 45K steps because its quality was notably worse than both HiFi-GAN (HFG) and iSTFT at the same point in training. However, this comparison might not be fully accurate since the batch size varied: BigVGAN consumed 4x the memory of iSTFT, which prompted an adjustment of the learning rate.
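The post does not say how the learning rate was adjusted. One common heuristic when the batch size changes is the linear scaling rule, which scales the learning rate by the same factor as the batch size. The sketch below shows that rule only as an illustration; the function name and the numbers are hypothetical and are not the values used in these runs.

```python
def scale_learning_rate(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Linear scaling rule: keep the ratio lr / batch_size constant.

    This is a heuristic, not the adjustment used in the thread.
    """
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch / base_batch


# Illustrative numbers only: a 4x smaller batch gives a 4x smaller learning rate.
smaller_batch_lr = scale_learning_rate(base_lr=2e-4, base_batch=64, new_batch=16)
print(f"{smaller_batch_lr:.2e}")
```

Even with a scaled learning rate, runs at different batch sizes are not strictly comparable step for step, which is consistent with the caveat above.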

Despite this, I believe given another week, BigVGAN might yield comparable results. Nonetheless, I'm currently inclined towards iSTFT due to its satisfactory performance.

Audio Comparisons:

HFG (190K steps) vs. iSTFT (190K steps)
BigVGAN (40K steps) vs. iSTFT (40K steps)

Link to audios.

p0p4k Sep 1, 2023
Maintainer

Thanks for your reply. I will consider adding these vocoder options in this repo as well.

Subarasheese · 2023-08-30T22:42:37Z

Subarasheese
Aug 30, 2023

@shigabeev Would you mind sharing the steps you took to organize the dataset and prepare it for training? I am trying to train a model on my native language, but I am facing issues, as seen here:

#33

3 replies

shigabeev Sep 1, 2023
Author

I can record a stream on YouTube if you'll watch.

jswildone Oct 24, 2023

@shigabeev If you did, that would be absolutely incredible. Happy to send some money to help you fund the training if so.

bzp83 May 21, 2024

Hi @shigabeev,
Did you record a video? Would you mind sharing it? Thank you!

beqabeqa473 · 2023-09-01T10:16:50Z

beqabeqa473
Sep 1, 2023

It would be cool to see samples.


0 replies

Xmiler · 2023-12-29T09:26:45Z

Xmiler
Dec 29, 2023

@shigabeev Hi!
Thanks for sharing your result. Could you tell us how you phonemized your dataset, please? 🙏

0 replies

shigabeev · 2023-12-29T11:06:05Z

shigabeev
Dec 29, 2023
Author

Natasha is already normalized and comes with accents marked, so I just trained the NN on it as is, on graphemes.


0 replies
