MSDWild for Pyannote

This repository automatically downloads the MSDWILD dataset and sets it up for use with pyannote-database.

It generates two subsets from the original few.train set, custom1_train and custom1_dev, because the original dataset only provides training and test data. By default, custom1_dev gets 6h of audio and custom1_train gets the remainder (~60h).
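The split described above can be sketched as follows. This is a minimal illustration, not the logic of split_rttm.py: the function name `split_by_duration`, the shuffling, and the seed are all assumptions made for the example. It moves files into the dev subset until the requested number of hours is reached and leaves the rest for training.

```python
import random
from typing import Dict, List, Tuple


def split_by_duration(
    durations: Dict[str, float],
    dev_hours: float = 6.0,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split file URIs into (train, dev) so that dev holds about `dev_hours`.

    `durations` maps each file URI to its length in seconds.
    Hypothetical helper: the repository's own script may split differently.
    """
    uris = sorted(durations)            # deterministic starting order
    random.Random(seed).shuffle(uris)   # reproducible shuffle (an assumption)

    target_seconds = dev_hours * 3600
    dev_seconds = 0.0
    train: List[str] = []
    dev: List[str] = []
    for uri in uris:
        if dev_seconds < target_seconds:
            dev.append(uri)
            dev_seconds += durations[uri]
        else:
            train.append(uri)
    return train, dev


# Toy data: 60 files of 10 minutes each (10h in total), 1h held out for dev.
toy_durations = {f"{i:05d}": 600.0 for i in range(60)}
train_uris, dev_uris = split_by_duration(toy_durations, dev_hours=1.0)
```

With the real few.train durations (65h54m in total), a 6h dev target would leave roughly 60h for training, which matches the default described above.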

The out-of-the-box protocol for pyannote.audio training is MSDWILD.SpeakerDiarization.CustomFew.
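Loading this protocol might look like the sketch below. It assumes pyannote.database 5.x (which exposes `registry` and `FileFinder`) and that database.yml sits in the working directory. The helper name `load_msdwild_protocol` is hypothetical and not part of this repository.

```python
PROTOCOL_NAME = "MSDWILD.SpeakerDiarization.CustomFew"


def load_msdwild_protocol(database_yml: str = "database.yml"):
    """Return the protocol object declared in `database_yml`.

    Assumes pyannote.database >= 5.0; the import is done lazily so that
    this snippet can be read and imported without pyannote installed.
    """
    from pyannote.database import FileFinder, registry

    registry.load_database(database_yml)
    # FileFinder resolves each file's audio path from the database config.
    return registry.get_protocol(
        PROTOCOL_NAME, preprocessors={"audio": FileFinder()}
    )


# A protocol name has three dot-separated parts: database, task, protocol.
database, task, protocol = PROTOCOL_NAME.split(".")
```

Once setup.sh has run, iterating over `load_msdwild_protocol().train()` should yield the training files.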

Instructions

Clone this repository, download the dataset zip from https://github.com/X-LANCE/MSDWILD#wavs, and put it in the msdwild folder. Then run setup.sh from the msdwild directory to download, extract, and generate the files (wav, rttm, uem, uris).
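To illustrate how the uem and uris files relate to the rttm annotations, here is a minimal sketch that derives both from RTTM lines. It is not the code of generate_uems.py or generate_uris.py: the function names are hypothetical, and it assumes each file's evaluated region runs from 0 to the end of its last annotated segment, whereas the real scripts may use the wav duration instead.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def rttm_extents(rttm_lines: Iterable[str]) -> Dict[str, float]:
    """Map each file URI to the end time (seconds) of its last segment.

    RTTM fields: SPEAKER <uri> <channel> <onset> <duration> ...
    """
    ends: Dict[str, float] = defaultdict(float)
    for line in rttm_lines:
        fields = line.split()
        if len(fields) < 5 or fields[0] != "SPEAKER":
            continue  # skip blank or non-speaker lines
        uri, onset, duration = fields[1], float(fields[3]), float(fields[4])
        ends[uri] = max(ends[uri], onset + duration)
    return dict(ends)


def to_uem_lines(extents: Dict[str, float]) -> List[str]:
    """Format UEM lines: <uri> <channel> <start> <end>."""
    return [f"{uri} 1 0.000 {end:.3f}" for uri, end in sorted(extents.items())]


# Made-up RTTM lines, for illustration only.
sample_rttm = [
    "SPEAKER 00001 1 0.50 2.00 <NA> <NA> spk0 <NA> <NA>",
    "SPEAKER 00001 1 3.00 1.25 <NA> <NA> spk1 <NA> <NA>",
    "SPEAKER 00002 1 0.00 4.00 <NA> <NA> spk0 <NA> <NA>",
]
extents = rttm_extents(sample_rttm)
uem_lines = to_uem_lines(extents)
uris = sorted(extents)  # one URI per line would go in the uris list file
```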

Original sets info

subset	# files	total length
few.train	2476	65h54m
few.val	490	9h49m
many.val	177	4h04m

Credits

MSDWild
