Divide and Remaster v3 is a multilingual rework of the Divide and Remaster v2 dataset by Pétermann et al.
The major changes from DnR v2 are as follows:
- the dialogue stem now contains content from more than 30 languages across various language families;
- speech, vocals, and/or vocalizations have been removed from the music and effects stems;
- loudness and timing parametrization have been adjusted to approximate the distributions of real cinematic content;
- the mastering process now preserves relative loudness between stems and approximates standard industry practices.
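
As a rough illustration of what "preserving relative loudness between stems" can mean, the sketch below applies one shared gain to every stem so that the mixture reaches a target level. This is not the DnR v3 mastering code. It uses plain RMS level as a crude stand-in for a proper loudness meter, and the function names, the random test signals, and the `-27.0` dB target are all hypothetical values chosen for the example.

```python
import numpy as np


def rms_db(x):
    """RMS level in dB re full scale (a crude stand-in for a loudness meter)."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20.0 * np.log10(max(rms, 1e-12))


def master_with_shared_gain(stems, target_db=-27.0):
    """Scale every stem by the same gain so the mixture reaches target_db.

    Because the gain is shared, level differences between stems are unchanged.
    target_db is a hypothetical value, not one taken from the dataset.
    """
    mixture = sum(stems.values())
    gain_db = target_db - rms_db(mixture)
    gain = 10.0 ** (gain_db / 20.0)
    return {name: gain * audio for name, audio in stems.items()}


# Synthetic noise stems at different levels (illustrative only).
rng = np.random.default_rng(0)
stems = {
    "speech": 0.10 * rng.standard_normal(48000),
    "music": 0.05 * rng.standard_normal(48000),
    "effects": 0.02 * rng.standard_normal(48000),
}
mastered = master_with_shared_gain(stems)
```

Normalizing each stem independently would instead flatten the level differences between dialogue, music, and effects, which is the behavior a shared gain avoids.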
See the wiki for instructions on using this dataset.
For the model trained on DnR v3, go here
The source code for recreating the dataset will be made available in this repository by September. (As you can imagine, pulling data from 40 different datasets leads to very messy source code 😭. I'm trying to standardize it so that it's easier to build on in future iterations.)
Divide and Remaster v3 is released under the CC BY-SA 4.0 license. See the wiki for the full license information of each source dataset.