What is the purpose of the Resegmentation and AdaptiveVoiceActivityDetection Pipeline? #1700

Open
asusdisciple opened this issue Apr 30, 2024 · 1 comment

@asusdisciple

Tested versions

  • Reproduced in 3.1.0

System information

Ubuntu 22.04, Lenovo P1 Gen 5 Workstation A4500

Issue description

I wanted to improve my segmentation with Pyannote, since most of my segments are very long when the same person is talking. Since min_duration_off is already set to 0.0, I looked through the code and found the classes Resegmentation and AdaptiveVoiceActivityDetection.

I thought that applying one of those methods would give me shorter segments, but the code does not seem to work. Is this legacy code, or should it work?

For AdaptiveVoiceActivityDetection I get the error:

  File "/home/.../PycharmProjects/..../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/voice_activity_detection.py", line 313, in apply
    vad_pipeline = VoiceActivityDetection("vad").instantiate(
  File "/home/.../PycharmProjects/..../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/voice_activity_detection.py", line 123, in __init__
    model = get_model(segmentation, use_auth_token=use_auth_token)
  File "/home/.../PycharmProjects/.../venv/lib/python3.10/site-packages/pyannote/audio/pipelines/utils/getter.py", line 89, in get_model
    model.eval()
AttributeError: 'NoneType' object has no attribute 'eval'

Could not download 'vad' model.

I initialize the model like this:

self.vad = AdaptiveVoiceActivityDetection(MODEL_PATH_SEG)
self.vad.instantiate({"num_epochs": 1, "batch_size": settings.BATCH_SIZE_SEG, "learning_rate": 0.1})
self.vad.to(self.device)
# call the pipeline
va = self.vad(tensor_audio_mapping)

To me it looks as if the instantiation is hardcoded (line 313) and the model key "vad" cannot be found?
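The traceback is consistent with a loader that returns None when the name "vad" cannot be resolved, after which `.eval()` is called on that None. Below is a minimal, self-contained sketch of that failure mode and of an explicit guard. It is not pyannote code: `DummyModel`, `load_checkpoint`, `get_model_unguarded`, `get_model_guarded`, and the path `local/segmentation.bin` are hypothetical stand-ins used only for illustration.

```python
from typing import Optional


class DummyModel:
    """Stand-in for a loaded model (hypothetical, not a pyannote class)."""

    def __init__(self) -> None:
        self.training = True

    def eval(self) -> "DummyModel":
        # Mimic the usual "switch to inference mode" call.
        self.training = False
        return self


def load_checkpoint(name: str) -> Optional[DummyModel]:
    """Hypothetical loader that returns None when `name` is unknown."""
    known = {"local/segmentation.bin"}  # assumed set of resolvable names
    if name not in known:
        print(f"Could not download '{name}' model.")
        return None
    return DummyModel()


def get_model_unguarded(name: str) -> DummyModel:
    model = load_checkpoint(name)
    model.eval()  # raises AttributeError when model is None
    return model


def get_model_guarded(name: str) -> DummyModel:
    model = load_checkpoint(name)
    if model is None:
        # Fail early with a message that names the unresolved identifier.
        raise ValueError(f"Model '{name}' could not be loaded.")
    return model.eval()
```

With these stand-ins, `get_model_unguarded("vad")` fails with the same kind of AttributeError as in the traceback, while `get_model_guarded("vad")` fails with a message that points at the unresolved name.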

For Resegmentation I get the error below. I cannot see what is wrong with my way of instantiating it, since the same approach works for SpeakerDiarization, for example:

  File "/home/.../PycharmProjects/.../pyannote_service.py", line 125, in diarize
    reseg = self.resegmentation_model(file=tensor_audio_mapping, diarization=diarization)
  File "/home/.../PycharmProjects/.../venv/lib/python3.10/site-packages/pyannote/audio/core/pipeline.py", line 304, in __call__
    raise RuntimeError(
RuntimeError: A pipeline must be instantiated with `pipeline.instantiate(parameters)` before it can be applied.

I initialize the model like this:

self.resegmentation_model = Resegmentation(segmentation=MODEL_PATH_SEG)
self.resegmentation_model.instantiate(co["params"])
self.resegmentation_model.to(self.device)
# diarization is the diarization output produced by a pyannote speaker diarization pipeline
reseg = self.resegmentation_model(file=tensor_audio_mapping, diarization=diarization)

where co is loaded from a YAML file that looks like this:

version: 3.1.0
pipeline:
  name: pyannote.audio.pipelines.SpeakerDiarization
  params:
    clustering: AgglomerativeClustering
    embedding: models/pyannote/pyannote_embedding.bin
    embedding_batch_size: 32
    embedding_exclude_overlap: true
    segmentation: models/pyannote/pyannote_segmentation.bin
    segmentation_batch_size: 32

params:
    min_duration_off: 0.0

I would appreciate any insight into what I might have missed, or just a short clarification if this code is not intended to be used.

Minimal reproduction example (MRE)

Can be found in my example above

@asusdisciple
Author

Okay, for the Resegmentation pipeline it seems that it does not work with pyannote/segmentation-3.0. It does work with pyannote/segmentation, which unfortunately gives me a few warnings.

Just wanted to let you know.

Best regards
