Fine Tuning comes out sounding worse than base model. I must be doing something wrong. #2359
Unanswered · smchughinfo asked this question in Q&A
Hello, I would like to use my own voice for TTS. Fine tuning is the right approach for that, right?
I noticed the README.md on GitHub was changed recently to include voice conversion via things like:
however, those results were awful. The output sounded very robotic and not at all like me. So I kept trying fine-tuning instead, but after training for 1000 epochs it somehow sounds worse than the base model. I am following these steps:
4: I modify recipes/ljspeech/fast_speech/train_fast_speech.py like so:
```python
path=os.path.join(output_path, "../LJSpeech-1.1-My-Test-Recordings/"),
```
where ../LJSpeech-1.1-My-Test-Recordings is a copy of the original LJSpeech-1.1 folder, trimmed from the full set down to 100 .wav files, with the first 10 files replaced by recordings of my own voice (the remaining 90 are the original speaker).
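The dataset preparation described above can be sketched with a small stdlib script. This is only an illustration, not code from the post: `build_finetune_dataset` and its parameters are hypothetical names, and it assumes the 10 personal recordings are re-recordings of the corresponding LJSpeech sentences, so that metadata.csv still matches the audio (if the transcripts do not match what is actually spoken in the replaced files, training will be learning from mislabeled data).

```python
import shutil
from pathlib import Path

def build_finetune_dataset(ljspeech_dir, my_wavs_dir, out_dir, keep=100, replace=10):
    """Copy the first `keep` LJSpeech wavs into `out_dir`, overwriting the
    first `replace` of them with the user's own recordings (renamed to the
    LJSpeech file names so metadata.csv still lines up), then trim
    metadata.csv to the kept utterance IDs. Returns the number of kept rows."""
    src_wavs = sorted(Path(ljspeech_dir, "wavs").glob("*.wav"))[:keep]
    my_wavs = sorted(Path(my_wavs_dir).glob("*.wav"))[:replace]
    out_wavs = Path(out_dir, "wavs")
    out_wavs.mkdir(parents=True, exist_ok=True)

    for i, src in enumerate(src_wavs):
        # Use the personal recording for the first `replace` slots,
        # keeping the original LJSpeech file name in every case.
        chosen = my_wavs[i] if i < len(my_wavs) else src
        shutil.copyfile(chosen, out_wavs / src.name)

    # Keep only the metadata rows whose utterance ID survived the trim.
    kept_ids = {p.stem for p in src_wavs}
    meta_lines = [
        line
        for line in Path(ljspeech_dir, "metadata.csv").read_text(encoding="utf-8").splitlines()
        if line.split("|", 1)[0] in kept_ids
    ]
    Path(out_dir, "metadata.csv").write_text("\n".join(meta_lines) + "\n", encoding="utf-8")
    return len(meta_lines)
```

The renaming keeps the LJSpeech IDs intact so the unmodified `ljspeech` formatter can still pair each row of metadata.csv with a wav file.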
Then I train for 1000 epochs, but the result sounds like static. Not total static; you can tell it is trying, but it comes out sounding WORSE than the base model.
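For context, the only change made in recipes/ljspeech/fast_speech/train_fast_speech.py is the dataset path. A minimal stand-in for that config (a plain dict instead of the recipe's `BaseDatasetConfig`, which would require TTS to be installed; the `formatter` and `meta_file_train` keys are assumed to match the LJSpeech recipe defaults) looks like:

```python
import os

# In the recipe, output_path is the directory containing the training script.
output_path = os.path.dirname(os.path.abspath(__file__))

# Stand-in for the recipe's BaseDatasetConfig(...). Only `path` is changed,
# pointing one level up at the trimmed 100-file dataset folder.
dataset_config = {
    "formatter": "ljspeech",
    "meta_file_train": "metadata.csv",
    "path": os.path.join(output_path, "../LJSpeech-1.1-My-Test-Recordings/"),
}
```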