synthesized wavs of long texts #7

Open
Liujingxiu23 opened this issue Jul 19, 2021 · 2 comments


Liujingxiu23 commented Jul 19, 2021

I downloaded the pretrained databaker model and synthesized wavs using inference.py.
The results are not very good: the alignment is wrong, especially when the input text is long.
For example, for the input "失恋的人特别喜欢往人烟罕至的角落里钻。" (roughly, "People who have been through a breakup especially like to burrow into deserted corners."), the synthesized wav sounds like:
失恋的人特别喜欢往人烟罕至的_角角落里钻钻钻钻_

For longer input texts, the synthesized wavs are totally wrong.

@light1726
Member

Hi @Liujingxiu23, thanks for your feedback. Attention errors can happen with vaenar-tts because no constraint is imposed on the attention alignment to make it monotonic; most of these errors are repeated phonemes. From my observation, such cases are rare. I have never seen a synthesized waveform that is totally wrong for a whole sentence.

Synthesis of long sentences is more challenging as there are not many long sentences in the training set.
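
One rough way to spot the repetition errors described above is to look at where the attention peaks for each output frame and count how often the peak moves back to an earlier input token. The sketch below assumes the alignment is available as a plain list of rows, one row per output frame and one weight per input token; the function names are hypothetical and are not part of this repository.

```python
# Hypothetical diagnostic, not part of the vaenar-tts code base.
# `alignment` is assumed to be a list of rows: one row per output frame,
# one attention weight per input token.

def attended_positions(alignment):
    """Return the index of the most-attended input token for each frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in alignment]


def count_backward_jumps(alignment):
    """Count frames whose attention peak moves back to an earlier token."""
    peaks = attended_positions(alignment)
    return sum(1 for prev, cur in zip(peaks, peaks[1:]) if cur < prev)
```

A count of zero means the attention peaks never move backward; a sentence with repeated phonemes would be expected to show at least one backward jump.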

I didn't do much parameter tuning on the Mandarin dataset. I think there are at least two changes that could improve the performance of Mandarin TTS:

  1. Use phonemes as input, or split each Pinyin syllable into consonant and vowel, instead of treating the Pinyin as a plain character sequence as I do.
  2. For the synthesis of out-of-dataset texts, predict prosodic boundaries like the ones annotated in the transcription.
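
As an illustration of point 1, here is a minimal sketch of splitting tone-numbered Pinyin into consonant (initial) and vowel (final) tokens. It assumes the input is space-separated syllables with a trailing tone digit, such as `jiao3 luo4`; treating `y` and `w` as initials is a simplification, and both helper names are hypothetical rather than taken from this repository.

```python
# Hypothetical helpers, not part of the vaenar-tts code base.
# Two-letter initials come first so that "zh" is matched before "z".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")
# Letters a final can start with ("v" is a common ASCII stand-in for "ü").
FINAL_STARTS = "aeiouvü"


def split_syllable(syllable):
    """Split one syllable into [initial, final+tone], e.g. 'zhi4' -> ['zh', 'i4']."""
    for initial in INITIALS:
        rest = syllable[len(initial):]
        if syllable.startswith(initial) and rest and rest[0] in FINAL_STARTS:
            return [initial, rest]
    # Zero-initial syllables such as 'an1' or 'er2' are kept whole.
    return [syllable]


def pinyin_to_units(text):
    """Turn a space-separated Pinyin string into a flat list of tokens."""
    units = []
    for syllable in text.split():
        units.extend(split_syllable(syllable))
    return units
```

With this scheme the model sees `["j", "iao3", "l", "uo4"]` for `jiao3 luo4` instead of nine separate characters, so each input token corresponds more closely to a sound.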

@Liujingxiu23
Author

@light1726 Thank you very much for your reply. I will train on my Mandarin dataset with phone sequences and prosodic boundary information to see how it performs.
