Replicating 120h spanish dataset results #3

Open
CJai-K opened this issue Jun 21, 2021 · 0 comments

CJai-K commented Jun 21, 2021

Hi @carlfm01,

Thank you for your activity in the various LPCNet/Tacotron-2 discussions. I have been trying to integrate the two models with the steps outlined by @MlWoo and yourself, but the results are not great. I can pick out words but the voice is very hoarse/noisy.

In my latest experiment I tried to replicate your results with the 120h Spanish dataset, but the output is still noisy. One thing to note about my process: I used the entire dataset to train both models, without making any adjustments for multiple speakers. Was this the correct approach?

My other question is whether this dataset needs any special preprocessing.

Below are my latest alignment and synthesis samples. Thank you and I look forward to your response!

step-165000-align

samples.zip
