Skip to content

Commit

Permalink
add dataset preview & update link for pre-train model.
Browse files Browse the repository at this point in the history
  • Loading branch information
MoonInTheRiver committed Jan 17, 2022
1 parent b54d0ef commit db7c8b8
Showing 1 changed file with 8 additions and 7 deletions.
15 changes: 8 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ or pip install -r requirements_3090.txt (GPU 3090, CUDA 11.4)

a) Download and extract the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/), then create a link to the dataset folder: `ln -s /xxx/LJSpeech-1.1/ data/raw/`

b) Download and Unzip the [ground-truth duration](https://drive.google.com/file/d/1SqwIISwaBZDiCW1MHTHx-MKX6_NQJ_f4/view?usp=sharing) extracted by [MFA](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz): `tar -xvf mfa_outputs.tar; mv mfa_outputs data/processed/ljspeech/`
b) Download and Unzip the [ground-truth duration](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/mfa_outputs.tar) extracted by [MFA](https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/releases/download/v1.0.1/montreal-forced-aligner_linux.tar.gz): `tar -xvf mfa_outputs.tar; mv mfa_outputs data/processed/ljspeech/`

c) Run the following scripts to pack the dataset for training/inference.

Expand All @@ -59,9 +59,9 @@ CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/lj_ds_beta6.yaml
```

We also provide:
- the pre-trained model of [DiffSpeech](https://drive.google.com/file/d/1AHRuNS379v2_lNuz4-Mjlpii7TZsfs3f/view?usp=sharing);
- the pre-trained model of [HifiGAN](https://drive.google.com/file/d/1Z3DJ9fvvzIci9DAf8jwchQs-Ulgpx6l8/view?usp=sharing) vocoder;
- the individual pre-trained model of [FastSpeech 2](https://drive.google.com/file/d/1Zp45YjKkkv5vQSA7woHIqEggfyLqQdqs/view?usp=sharing) for the shallow diffusion mechanism in DiffSpeech;
- the pre-trained model of [DiffSpeech](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/lj_ds_beta6_1213.zip);
- the pre-trained model of [HifiGAN](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/0414_hifi_lj_1.zip) vocoder;
- the individual pre-trained model of [FastSpeech 2](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/fs2_lj_1.zip) for the shallow diffusion mechanism in DiffSpeech;

Remember to put the pre-trained models in `checkpoints` directory.

Expand All @@ -72,6 +72,7 @@ About the determination of 'k' in shallow diffusion: We recommend the trick intr

### 0. Data Acquirement
- See in [apply_form](https://github.com/MoonInTheRiver/DiffSinger/blob/master/resources/apply_form.md).
- Dataset [preview](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/popcs_preview.zip).

### 1. Data Preparation
a) Download and extract PopCS, then create a link to the dataset folder: `ln -s /xxx/popcs/ data/processed/`
Expand All @@ -96,9 +97,9 @@ CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/popcs_ds_beta6_o
```

We also provide:
- the pre-trained model of [DiffSinger](https://drive.google.com/file/d/1QEXcvhhiUiHEK2ItXZ8EDHwv8bawiaIX/view?usp=sharing);
- the pre-trained model of [FFT-Singer](https://drive.google.com/file/d/1XRCdkI8B-DkRe8NfUJqgSjM-9c0gXQvJ/view?usp=sharing) for the shallow diffusion mechanism in DiffSinger;
- the pre-trained model of [HifiGAN-Singing](https://drive.google.com/file/d/1Z9bH3vorM34gBbjBlGGWWGVl4PwYy3YY/view?usp=sharing) which is specially designed for SVS with NSF mechanism.
- the pre-trained model of [DiffSinger](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/popcs_ds_beta6_offline_pmf0_1230.zip);
- the pre-trained model of [FFT-Singer](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/popcs_fs2_pmf0_1230.zip) for the shallow diffusion mechanism in DiffSinger;
- the pre-trained model of [HifiGAN-Singing](https://github.com/MoonInTheRiver/DiffSinger/releases/download/pre-release/0109_hifigan_bigpopcs_hop128.zip) which is specially designed for SVS with NSF mechanism.

*Note that:*

Expand Down

0 comments on commit db7c8b8

Please sign in to comment.