English latest checkpoint #97
Replies: 2 comments
-
Hi, I do not have a model trained on the latest code, nor the time or resources to train one. I can help you debug your training if needed.
-
Ah, it seems you are loading a checkpoint whose configuration differs from your current one. The checkpoint's weight is [184, 192] while the current model's is [178, 192]. Since enc_p.emb is the text encoder's symbol embedding, its first dimension is the number of symbols, so most likely the checkpoint was trained with a different symbol list than the one you are using (184 symbols versus 178). If you also use a different speech sample rate or a different hop_size for mel frame extraction, you should first familiarize yourself with the HiFi-GAN model framework and adjust your decoder [HiFiGAN] parameters to fit your custom config.
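The two shapes in the error can be compared directly before calling load_state_dict. Below is a minimal sketch, assuming enc_p.emb.weight has shape [number of symbols, hidden channels] as in the reference VITS code. The helper names `embedding_shape` and `check_symbols` are hypothetical and are not part of the repo.

```python
from typing import Mapping, Sequence, Tuple

# Parameter name copied from the error message in this thread.
EMB_KEY = "enc_p.emb.weight"


def embedding_shape(state_dict: Mapping[str, object], key: str = EMB_KEY) -> Tuple[int, ...]:
    """Return the shape of the text-embedding weight stored in a state dict."""
    # Works for torch tensors and for any object exposing a `.shape` attribute.
    return tuple(state_dict[key].shape)


def check_symbols(state_dict: Mapping[str, object], symbols: Sequence[str]) -> str:
    """Compare the checkpoint's symbol count with the current symbol list."""
    n_ckpt = embedding_shape(state_dict)[0]  # rows = symbols seen at training time
    n_cfg = len(symbols)                     # symbols the current code would build
    if n_ckpt == n_cfg:
        return f"OK: checkpoint and symbol list both have {n_cfg} entries"
    return (
        f"Mismatch: checkpoint embeds {n_ckpt} symbols, "
        f"current symbol list has {n_cfg} (difference {n_ckpt - n_cfg})"
    )
```

With PyTorch installed, the state dict would come from something like `torch.load(path, map_location="cpu")`. VITS-style checkpoints typically keep the weights under a "model" key, but that is an assumption about your file, so check its keys first. Matching counts do not guarantee the same symbol order, so the original symbols file is still needed for correct inference.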
-
Hello, thank you for writing this cool repo.
Is it possible for you to share the latest LJ Speech model? I have been struggling to find any VITS2 checkpoints, except the one you shared at 64k steps. Someone who posted Vietnamese samples shared models and configs on Drive, but not the symbols, which is why I am running into problems when trying to run inference with them.
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
size mismatch for enc_p.emb.weight: copying a param with shape torch.Size([184, 192]) from checkpoint, the shape in current model is torch.Size([178, 192]).
I found a model trained with plain VITS, but no matter how long I fine-tuned it, I only got gibberish-sounding audio. If you have a VITS2 model, please share it with us.
Best regards.