How to use PTH file as voice? #669
theymightbedavis asked this question in Q&A (unanswered, 1 comment)
So the .pth file isn't considered a voice: when you pretrain using a voice, you generate another copy of the model that is better suited to generating in the style of that voice. Replace the .pth file that it uses by default with the one that you have. EDIT: My apologies, I am more familiar with another TTS model, and I didn't realise the conditioning latents are also .pth files. Could you please verify if the .pth files were made with the |
-
Hi, I'm a beginner with this and am running Tortoise TTS in a Kaggle notebook. I am trying to use a pre-trained voice model stored in a .pth file. Has anyone had success using .pth files instead of putting multiple .wav files into a voices folder?
I created a folder inside the voices folder called "customvoice" and put the .pth file in there.
However, I am not able to run the gen = tts.tts_with_preset(...) step; it raises an error. Here is the code I am running:
#@markdown Pick one of the voices from the output above
voice = 'customvoice' #@param {type:"string"}
#@markdown Load it and send it through Tortoise.
voice_samples, conditioning_latents = load_voice(voice)
gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
                          preset=preset)
torchaudio.save('generated.wav', gen.squeeze(0).cpu(), 24000)
IPython.display.Audio('generated.wav')
Here is the error I am getting:
ValueError Traceback (most recent call last)
/tmp/ipykernel_27/3896845072.py in <module>
6 #gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
7 # preset=preset)
----> 8 gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents)
9
10 torchaudio.save('generated.wav', gen.squeeze(0).cpu(), 24000)
/kaggle/working/tortoise-tts-fast/tortoise/api.py in tts_with_preset(self, text, preset, **kwargs)
532 settings.update(presets[preset])
533 settings.update(kwargs) # allow overriding of preset settings with kwargs
--> 534 return self.tts(text, **settings)
535
536 def tts(
/kaggle/working/tortoise-tts-fast/tortoise/api.py in tts(self, text, voice_samples, conditioning_latents, k, verbose, use_deterministic_seed, return_deterministic_state, latent_averaging_mode, num_autoregressive_samples, temperature, length_penalty, repetition_penalty, top_p, max_mel_tokens, cvvp_amount, diffusion_iterations, cond_free, cond_free_k, diffusion_temperature, sampler, half, original_tortoise, **hf_generate_kwargs)
636 )
637 elif conditioning_latents is not None:
--> 638 auto_conditioning, diffusion_conditioning = conditioning_latents
639 else:
640 (
ValueError: too many values to unpack (expected 2)
Please help!
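The traceback points at the line in api.py that unpacks conditioning_latents into exactly two values (auto_conditioning and diffusion_conditioning), so the object loaded from the .pth file probably holds something other than a pair of latents, for example a full model checkpoint. A minimal sketch for checking that, assuming (as in upstream Tortoise, though I have not confirmed it for this fork) that load_voice passes the contents of a lone .pth file straight through as conditioning_latents. The helper name describe_latents and the file path in the comment are hypothetical, not part of Tortoise:

```python
from collections.abc import Mapping, Sequence


def describe_latents(obj):
    """Summarise an object loaded from a .pth file (hypothetical helper).

    tts() unpacks conditioning_latents into exactly two values, so anything
    other than a 2-item sequence is a likely cause of the unpack error.
    """
    if isinstance(obj, Mapping):
        # Unpacking a dict iterates over its keys; a checkpoint or state
        # dict usually has many, which gives "too many values to unpack".
        return f"mapping with {len(obj)} keys (looks like a checkpoint, not latents)"
    if isinstance(obj, Sequence) and not isinstance(obj, (str, bytes)):
        if len(obj) == 2:
            return "sequence of 2 items (matches the two-value unpack)"
        return f"sequence of {len(obj)} items (the unpack expects exactly 2)"
    return f"{type(obj).__name__} (not a plain 2-item sequence)"


# Usage in the notebook (the path is an assumption, adjust it to your file):
#   import torch
#   obj = torch.load("tortoise/voices/customvoice/your_file.pth", map_location="cpu")
#   print(describe_latents(obj))
```

If this reports a mapping or more than two items, the file is likely not a set of conditioning latents in the format this version of tts() expects, which would fit the question in the reply about how the .pth file was created.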