How to use PTH file as voice? #669

theymightbedavis · 2023-11-07T07:34:18Z

Hi, I'm a beginner and I'm running Tortoise TTS in a Kaggle notebook. I am trying to use a pre-trained voice model stored in a PTH file. Has anyone had success using a PTH file instead of putting multiple WAV files into a voices folder?

I created a folder called "customvoice" inside the voices folder and put the PTH file in there.

However, the gen = tts.tts_with_preset(...) call fails with an error. Here is the code I am running:

#@markdown Pick one of the voices from the output above
voice = 'customvoice' #@param {type:"string"}

#@markdown Load it and send it through Tortoise.
voice_samples, conditioning_latents = load_voice(voice)
gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
                          preset=preset)
torchaudio.save('generated.wav', gen.squeeze(0).cpu(), 24000)
IPython.display.Audio('generated.wav')

Here is the error I am getting:

ValueError Traceback (most recent call last)
/tmp/ipykernel_27/3896845072.py in
6 #gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents,
7 # preset=preset)
----> 8 gen = tts.tts_with_preset(text, voice_samples=voice_samples, conditioning_latents=conditioning_latents)
9
10 torchaudio.save('generated.wav', gen.squeeze(0).cpu(), 24000)

/kaggle/working/tortoise-tts-fast/tortoise/api.py in tts_with_preset(self, text, preset, **kwargs)
532 settings.update(presets[preset])
533 settings.update(kwargs) # allow overriding of preset settings with kwargs
--> 534 return self.tts(text, **settings)
535
536 def tts(

/kaggle/working/tortoise-tts-fast/tortoise/api.py in tts(self, text, voice_samples, conditioning_latents, k, verbose, use_deterministic_seed, return_deterministic_state, latent_averaging_mode, num_autoregressive_samples, temperature, length_penalty, repetition_penalty, top_p, max_mel_tokens, cvvp_amount, diffusion_iterations, cond_free, cond_free_k, diffusion_temperature, sampler, half, original_tortoise, **hf_generate_kwargs)
636 )
637 elif conditioning_latents is not None:
--> 638 auto_conditioning, diffusion_conditioning = conditioning_latents
639 else:
640 (

ValueError: too many values to unpack (expected 2)

Please help!

CoderCowMoo · 2023-12-07T14:15:12Z

So the .pth file isn't treated as a voice. When you train on a voice, you produce another copy of the model that is better suited to generating speech in the style of that voice. Replace the .pth file that Tortoise uses by default with the one that you have.

EDIT: My apologies, I am more familiar with another TTS model and didn't realise that conditioning latents are also stored as .pth files. Could you please verify whether your .pth file was made with the get_conditioning_latents.py script?
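
One way to check is to load the file yourself and look at what kind of object it holds. The traceback shows api.py unpacking conditioning_latents into exactly two values, so a latents file should hold a pair, while a trained model checkpoint is usually a dict with many keys. Below is a rough sketch, not anything that ships with Tortoise: classify_payload and inspect_pth are names I made up, and it assumes load_voice passes the contents of a single .pth file straight through as conditioning_latents.

```python
from collections.abc import Mapping


def classify_payload(obj):
    """Guess what a loaded .pth object is, based only on its container shape."""
    if isinstance(obj, Mapping):
        # Iterating a dict yields its keys, so unpacking a checkpoint with
        # many entries into two names raises "too many values to unpack".
        return "mapping with %d keys (probably a model checkpoint)" % len(obj)
    if isinstance(obj, (tuple, list)):
        if len(obj) == 2:
            return "pair (matches the two values api.py unpacks)"
        return "sequence of length %d (not a latents pair)" % len(obj)
    return "unexpected type: " + type(obj).__name__


def inspect_pth(path):
    """Load a .pth file on the CPU and describe its contents.

    Hypothetical helper. torch.load unpickles the file, so only use it on
    files from a source you trust.
    """
    import torch  # imported lazily so classify_payload works without torch

    obj = torch.load(path, map_location="cpu")
    return classify_payload(obj)
```

In the notebook you would call print(inspect_pth(path)) with the path to the file inside your customvoice folder. If it reports a mapping, the file is most likely a model checkpoint rather than conditioning latents, which would explain the ValueError above.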
