Fine Tuning comes out sounding worse than base model. I must be doing something wrong. #2359
Unanswered · smchughinfo asked this question in Q&A
Hello, I would like to use my own voice for TTS. Fine tuning is the right approach for that, right?
I noticed the README.md on GitHub was changed recently to include voice conversion via things like:
however, those results were awful. The output sounded very robotic and not at all like me. So I kept trying fine-tuning instead, but after training for 1000 epochs it somehow sounds worse than the base model. I am following these steps:
4: I modify recipes/ljspeech/fast_speech/train_fast_speech.py like so:
```python
path=os.path.join(output_path, "../LJSpeech-1.1-My-Test-Recordings/"),
```
where ../LJSpeech-1.1-My-Test-Recordings is a copy of the original LJSpeech-1.1 folder, trimmed from the full set down to 100 .wav files, with the first 10 files replaced by recordings of my own voice (the remaining 90 are the original speaker).
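The dataset preparation described above can be sketched with a small stdlib script. This is only an illustration, not code from the post: `build_finetune_dataset` and its parameters are hypothetical names, and it assumes the 10 personal recordings are re-recordings of the corresponding LJSpeech sentences, so that metadata.csv still matches the audio (if the transcripts do not match what is actually spoken in the replaced files, training will be learning from mislabeled data).

```python
import shutil
from pathlib import Path

def build_finetune_dataset(ljspeech_dir, my_wavs_dir, out_dir, keep=100, replace=10):
    """Copy the first `keep` LJSpeech wavs into `out_dir`, overwriting the
    first `replace` of them with the user's own recordings (renamed to the
    LJSpeech file names so metadata.csv still lines up), then trim
    metadata.csv to the kept utterance IDs. Returns the number of kept rows."""
    src_wavs = sorted(Path(ljspeech_dir, "wavs").glob("*.wav"))[:keep]
    my_wavs = sorted(Path(my_wavs_dir).glob("*.wav"))[:replace]
    out_wavs = Path(out_dir, "wavs")
    out_wavs.mkdir(parents=True, exist_ok=True)

    for i, src in enumerate(src_wavs):
        # Use the personal recording for the first `replace` slots,
        # keeping the original LJSpeech file name in every case.
        chosen = my_wavs[i] if i < len(my_wavs) else src
        shutil.copyfile(chosen, out_wavs / src.name)

    # Keep only the metadata rows whose utterance ID survived the trim.
    kept_ids = {p.stem for p in src_wavs}
    meta_lines = [
        line
        for line in Path(ljspeech_dir, "metadata.csv").read_text(encoding="utf-8").splitlines()
        if line.split("|", 1)[0] in kept_ids
    ]
    Path(out_dir, "metadata.csv").write_text("\n".join(meta_lines) + "\n", encoding="utf-8")
    return len(meta_lines)
```

The renaming keeps the LJSpeech IDs intact so the unmodified `ljspeech` formatter can still pair each row of metadata.csv with a wav file.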
Then I train for 1000 epochs, but the result sounds like static. Not total static; you can tell it is trying, but it comes out sounding WORSE than the base model.
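For context, the only change made in recipes/ljspeech/fast_speech/train_fast_speech.py is the dataset path. A minimal stand-in for that config (a plain dict instead of the recipe's `BaseDatasetConfig`, which would require TTS to be installed; the `formatter` and `meta_file_train` keys are assumed to match the LJSpeech recipe defaults) looks like:

```python
import os

# In the recipe, output_path is the directory containing the training script.
output_path = os.path.dirname(os.path.abspath(__file__))

# Stand-in for the recipe's BaseDatasetConfig(...). Only `path` is changed,
# pointing one level up at the trimmed 100-file dataset folder.
dataset_config = {
    "formatter": "ljspeech",
    "meta_file_train": "metadata.csv",
    "path": os.path.join(output_path, "../LJSpeech-1.1-My-Test-Recordings/"),
}
```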