
Can I train the Chinese model? #70

Closed · Tsangchi-Lam opened this issue Nov 23, 2023 · 18 comments

Comments

@Tsangchi-Lam

I want to train the Chinese model. Do you support mixed input in Chinese and English?

@Kreevoz

Kreevoz commented Nov 23, 2023

Look at issue #41 to check the current progress.

@yl4579 (Owner)

yl4579 commented Nov 24, 2023

You can, but with the current English PL-BERT the quality won’t be as good as originally proposed. I’m working on multilingual PL-BERT now, and it may take one or two months to finish.

@yl4579 yl4579 closed this as completed Nov 24, 2023
@yl4579 (Owner)

yl4579 commented Nov 24, 2023

See yl4579/StyleTTS#10 for more details.

@hermanseu

@yl4579 I trained StyleTTS 2 successfully on Chinese data, and it sounds very good. Since wavlm-base-plus only supports English, I used a Chinese HuBERT model as the SLM. Now that I want to train a single model for both Chinese and English, I cannot find a pre-trained model that supports Chinese and English at the same time. Do you have any suggestions for the SLM?

@yl4579 (Owner)

yl4579 commented Nov 24, 2023

You can try the Whisper encoder, which was trained on multiple languages. You can also try multilingual wav2vec 2.0: https://huggingface.co/facebook/wav2vec2-large-xlsr-53
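For anyone who wants to try this: below is a minimal sketch of loading the multilingual wav2vec 2.0 checkpoint as an SLM feature extractor, assuming the Hugging Face transformers API and the 16 kHz mono input StyleTTS 2 feeds its SLM. This is illustrative, not the repo's actual wiring.

```python
# Sketch: multilingual wav2vec 2.0 as an SLM feature extractor in place of
# WavLM. Assumes Hugging Face transformers and 16 kHz mono audio.
import torch
from transformers import Wav2Vec2Model

slm = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-large-xlsr-53")
slm.eval()

wav = torch.randn(1, 16000)  # one second of dummy 16 kHz audio
with torch.no_grad():
    out = slm(input_values=wav, output_hidden_states=True)

# The SLM adversarial loss consumes per-layer hidden states: for this large
# checkpoint, a tuple of 25 tensors (embeddings + 24 layers), each (B, T', 1024).
feats = torch.stack(out.hidden_states, dim=1)  # (B, 25, T', 1024)
print(feats.shape)
```

If you wire this into a StyleTTS 2 config, the SLM hidden size and layer count would presumably need to change to match (1024 and 25 for this checkpoint versus 768 and 13 for wavlm-base-plus); verify against your own config and checkpoint.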

@zhouyong64

> @yl4579 I trained StyleTTS 2 successfully on Chinese data, and it sounds very good.

Did you use the English PL-BERT or did you train PL-BERT with Chinese data?

@hermanseu

I trained PL-BERT with Chinese data.

@Moonmore

> I trained StyleTTS 2 successfully on Chinese data, and it sounds very good. Since wavlm-base-plus only supports English, I used a Chinese HuBERT model as the SLM. Now that I want to train a single model for both Chinese and English, I cannot find a pre-trained model that supports Chinese and English at the same time. Do you have any suggestions for the SLM?

What is your modeling unit? IPA or Pinyin?

@hermanseu

@Moonmore The modeling unit is pinyin.

test.zip is a synthesized sample.

@zhouyong64

> @Moonmore The modeling unit is pinyin.
>
> test.zip is a synthesized sample.

Do you use pinyin tones when training the Chinese PL-BERT? I believe StyleTTS uses F0 for Chinese tones. Can a PL-BERT with tones work with StyleTTS?

@hermanseu

I trained the Chinese PL-BERT without pinyin tones, but a PL-BERT with tones may also work, so you can try it.

@zhouyong64

> I trained the Chinese PL-BERT without pinyin tones, but a PL-BERT with tones may also work, so you can try it.

How many samples did you use to train Chinese PL-BERT?

@hermanseu

@zhouyong64 I used about 84,000,000 text sentences to train the Chinese PL-BERT model.

@Moonmore

> @Moonmore The modeling unit is pinyin.
>
> test.zip is a synthesized sample.

Sounds really good. May I ask whether the pinyin unit you mentioned can be decomposed into individual phones? And how do you align the PL-BERT input with the text input?

@hermanseu

@Moonmore
I used the same pinyin phonemes (initials and finals: sheng1 mu3 / yun4 mu3) to train all the models, but when training the ASR model I used the phonemes without tones. If a pinyin unit cannot be decomposed, the whole pinyin syllable can perhaps be treated as a single phoneme.

@zhouyong64 Sorry for the wrong information yesterday: I actually trained PL-BERT with tones and trained the ASR model without tones.

> I trained the Chinese PL-BERT without pinyin tones, but a PL-BERT with tones may also work, so you can try it.
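Since the thread doesn't name a G2P tool, here is a minimal sketch of one way to get that initial/final split, with and without tones, assuming the pypinyin library; hermanseu's actual frontend is not specified, so treat this as illustrative.

```python
# Sketch: decomposing Chinese text into pinyin initials/finals, with and
# without tones (pypinyin is an assumption; the thread names no G2P tool).
from pypinyin import pinyin, Style

text = "你好"  # "hello"

initials = [s[0] for s in pinyin(text, style=Style.INITIALS, strict=False)]
finals_tone = [s[0] for s in pinyin(text, style=Style.FINALS_TONE3, strict=False)]
finals_plain = [s[0] for s in pinyin(text, style=Style.FINALS, strict=False)]

# Interleave initial + final per syllable, dropping empty initials
# (zero-initial syllables like 'an' have no sheng mu).
with_tones = [p for pair in zip(initials, finals_tone) for p in pair if p]
without_tones = [p for pair in zip(initials, finals_plain) for p in pair if p]

print(with_tones)     # ['n', 'i3', 'h', 'ao3']  -> PL-BERT units (with tones)
print(without_tones)  # ['n', 'i', 'h', 'ao']    -> ASR units (no tones)
```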

@Moonmore

> @Moonmore I used the same pinyin phonemes (initials and finals: sheng1 mu3 / yun4 mu3) to train all the models, but when training the ASR model I used the phonemes without tones. If a pinyin unit cannot be decomposed, the whole pinyin syllable can perhaps be treated as a single phoneme.
>
> @zhouyong64 Sorry for the wrong information yesterday: I actually trained PL-BERT with tones and trained the ASR model without tones.
>
> I trained the Chinese PL-BERT without pinyin tones, but a PL-BERT with tones may also work, so you can try it.

So can I understand it this way: all text-related models are trained on the same phoneme units, and features are obtained for each minimal pronunciation unit, e.g. ni3 hao3 -> n i3 h ao3, so the input length is 4 and the output lengths of the text encoder and the BERT model are also 4? And how do you construct the PL-BERT labels?

@hermanseu

@Moonmore
Yes, the output lengths of the text encoder and BERT are the same as the input lengths.
As for the PL-BERT labels, you can read the logic of dataloader.py in the PL-BERT repo; it is explained clearly there.
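For readers without the repo at hand, here is a minimal sketch of the label-construction idea, following the masked-phoneme plus grapheme-prediction scheme described in the PL-BERT paper rather than a verbatim copy of dataloader.py; the function name, masking probability, and toy vocabulary below are assumptions.

```python
# Sketch of PL-BERT-style label construction (illustrative, not dataloader.py).
# Every phoneme of a word carries that word's grapheme token id as its label,
# and masking replaces a whole word's phonemes at once.
import random

MASK = "<mask>"

def make_plbert_example(words, phonemize, token_id, mask_prob=0.15):
    """words: graphemes, e.g. ['你', '好'];
    phonemize: grapheme -> phoneme list, e.g. '你' -> ['n', 'i3'];
    token_id: grapheme -> integer vocabulary id (hypothetical lookup)."""
    inputs, phoneme_labels, grapheme_labels = [], [], []
    for w in words:
        phones = phonemize(w)
        masked = random.random() < mask_prob  # mask whole words, not lone phonemes
        for p in phones:
            inputs.append(MASK if masked else p)
            phoneme_labels.append(p)             # masked-phoneme prediction target
            grapheme_labels.append(token_id[w])  # grapheme prediction target
    return inputs, phoneme_labels, grapheme_labels

# '你好' -> n i3 h ao3: input length 4, both label sequences length 4,
# matching the alignment discussed above.
vocab = {"你": 101, "好": 102}
g2p = {"你": ["n", "i3"], "好": ["h", "ao3"]}
x, yp, yg = make_plbert_example(["你", "好"], g2p.get, vocab)
print(x, yp, yg)
```

This keeps the BERT output aligned one-to-one with the phoneme input, which is why the text encoder and BERT output lengths match the input length.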

@Moonmore

> @Moonmore Yes, the output lengths of the text encoder and BERT are the same as the input lengths. As for the PL-BERT labels, you can read the logic of dataloader.py in the PL-BERT repo; it is explained clearly there.

@hermanseu Thank you for your reply.

Repository owner locked and limited conversation to collaborators Nov 29, 2023
@yl4579 yl4579 converted this issue into discussion #111 Nov 29, 2023

