Can this architecture learn the timbre of source speech simultaneously? #15

JianyuZhan · 2024-08-14T11:21:50Z

Hi, I read the paper and see that this architecture uses a composable, fixed HiFi-GAN vocoder for the final speech synthesis. Is it possible to incorporate this component into the final training objective so that the model also learns the speaker's timbre and intonation?

