MagVITS

VITS with phoneme-level prosody modeling based on MaskGIT (WIP)

Features: inference speed roughly equal to Bert-VITS2, and possibly better prosody than Bert-VITS2.

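The code is currently being refactored and may not run yet, so running it is not recommended for now. A Chinese pretrained model will be uploaded soon (data: Genshin Impact Chinese + aishell, slightly over 200 hours in total).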

asr
configs
img
module
pretrain
text
transformer
README.md
asr_train.py
data_conf.py
extract_duration.py
extract_spk_embedding.py
extract_ssl.py
gen_filelist.py
gen_phonemes.py
inference.py
requirements.txt
resample.py
utils.py
vits_train.py
