pipeline.to(torch.device("cuda")) not working on T4 Tesla GPU (pyannote==3.0.0) #1475

guilhermehge · 2023-09-26T18:35:25Z

I've been testing the new pyannote 3.0.0 today, and it seems that adding

import torch
pipeline.to(torch.device("cuda"))

to my code no longer moves the pipeline to the GPU.

I have tried the following:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # returns cuda
pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
).to(device)

##########

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # returns cuda
pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
).to(torch.device(device))

##########

pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
)

pipeline = pipeline.to(torch.device("cuda:0"))

But nothing seems to work.

When I check pipeline.device after moving the pipeline, it returns device(type='cuda'), but the GPU is still not being used. This is what nvidia-smi reports WHILE THE PIPELINE IS RUNNING:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000001:00:00.0 Off |                  Off |
| N/A   43C    P0    27W /  70W |    855MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
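A rough way to watch for this from Python rather than eyeballing the table is to poll nvidia-smi in its CSV query mode. The sketch below is only an illustration: the helper names `parse_gpu_stats` and `query_gpu_stats` are made up for this example, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess


def parse_gpu_stats(csv_text):
    """Parse 'utilization.gpu, memory.used' CSV rows into (util %, MiB) tuples."""
    stats = []
    for line in csv_text.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        stats.append((int(util), int(mem)))
    return stats


def query_gpu_stats():
    """Ask nvidia-smi for per-GPU utilization and memory (needs an NVIDIA driver)."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=utilization.gpu,memory.used",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_gpu_stats(out)


# A sample row shaped like the T4 report above: 0 % utilization, 855 MiB used.
print(parse_gpu_stats("0, 855\n"))  # prints [(0, 855)]
```

Calling `query_gpu_stats()` a few times while the pipeline is running should show non-zero utilization if the GPU is actually doing the work.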

Colab notebook to reproduce the issue (MRE): https://colab.research.google.com/drive/16zpDvNa5fUs8a_r-d-DxbdAdLHEPrgta?usp=sharing

PS: this was working with the Interspeech and 2022.07 checkpoints and the previous version.

Edit: I did some testing, and the problem seems to be the embedding model. I switched to the "speechbrain/spkrec-ecapa-voxceleb" embedding model by editing the config.yaml file, and in that case the GPU was used properly.
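For anyone repeating this diagnostic, the config.yaml edit amounts to replacing one value. The sketch below works on an already-parsed config dictionary and assumes the embedding model is stored under `pipeline -> params -> embedding`; that layout, the placeholder values, and the helper name `with_embedding` are assumptions for illustration, so check them against your own copy of the file.

```python
import copy


def with_embedding(config, embedding):
    """Return a copy of the parsed config with a different embedding model."""
    patched = copy.deepcopy(config)  # leave the caller's dictionary untouched
    patched["pipeline"]["params"]["embedding"] = embedding
    return patched


# Illustrative stand-in for a parsed config.yaml (values are placeholders).
config = {
    "pipeline": {
        "params": {
            "embedding": "<original embedding model>",
            "segmentation": "<segmentation model>",
        }
    }
}

patched = with_embedding(config, "speechbrain/spkrec-ecapa-voxceleb")
print(patched["pipeline"]["params"]["embedding"])
```

The patched dictionary can then be written back to a local config.yaml and that file loaded in place of the hub checkpoint.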


github-actions · 2023-09-26T18:35:47Z

Thank you for your issue. You might want to check the FAQ if you haven't done so already.

Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read this first and update your request accordingly, if needed.

If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:

installation
data preparation
model download
etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

Companies relying on pyannote.audio in production may contact me via email regarding:

paid scientific consulting around speaker diarization and speech processing in general;
custom models and tailored features (via the local tech transfer office).

This is an automated reply, generated by FAQtory

realfolkcode · 2023-09-27T10:06:49Z

I am facing the same issue. If I understand correctly, this is because the speaker embedding model runs with ONNX (credit to #1476). But I am not sure that reverting to speechbrain/spkrec-ecapa-voxceleb would give better quality. @hbredin, could you please explain how the speaker embedding model affects diarization quality? And how would 3.0 with the old embedding model perform against 2.1?

hbredin · 2023-09-27T10:26:08Z

Switching back to speechbrain/spkrec-ecapa-voxceleb would result in:

same missed detection and false alarm rates as 3.0 (which are much better than 2.1)
speaker confusion rate between that of 2.1 and 3.0 (but you would have to tune the clustering threshold hyper-parameter).

hbredin · 2023-09-27T10:52:19Z

Could you try this and let me know whether it lets the pipeline run on GPU on your side?

pip install https://github.com/pyannote/pyannote-audio/archive/refs/heads/fix/onnxruntime-gpu.zip

All it does is switch from onnxruntime to onnxruntime-gpu, which does seem to also support CPU.
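To see which of the two packages is actually installed, one option is to ask ONNX Runtime which execution providers it offers. This is a minimal sketch, not part of the fix branch: `pick_providers` is a hypothetical helper, and the expectation that a CPU-only install does not list `CUDAExecutionProvider` is worth confirming on your own machine.

```python
def pick_providers(available, want_gpu=True):
    """Order execution providers so CUDA is tried first, with CPU as fallback."""
    ordered = []
    if want_gpu and "CUDAExecutionProvider" in available:
        ordered.append("CUDAExecutionProvider")
    if "CPUExecutionProvider" in available:
        ordered.append("CPUExecutionProvider")
    return ordered


try:
    import onnxruntime as ort

    available = ort.get_available_providers()
except ImportError:
    # onnxruntime (or onnxruntime-gpu) is not installed in this environment.
    available = []

print("available providers:", available)
print("would request:", pick_providers(available))
```

If `CUDAExecutionProvider` is missing from the first line, an ONNX model cannot run on the GPU no matter which device the pipeline reports.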

guilhermehge · 2023-09-27T14:22:09Z

Just to clarify, I only switched to the older embedding model to check if the problem was the newer model, just for diagnostics.

gau-nernst · 2023-09-29T01:25:28Z

I notice that 'pyannote/speaker-diarization-3.0' is noticeably slower than 'pyannote/speaker-diarization', even with the GPU fix. Has anyone else observed the same thing? I will put together some sample benchmark code when I have time.

hbredin · 2023-09-29T06:23:01Z

Hey @gau-nernst, I opened a related issue here #1481. Please continue the discussion there.

hbredin · 2023-11-09T12:05:17Z

FYI: #1537

hbredin · 2023-11-16T13:04:42Z

Latest version no longer relies on ONNX runtime.
Please update to pyannote.audio 3.1 and pyannote/speaker-diarization-3.1 (and open new issues if needed).
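A minimal sketch of loading the newer checkpoint, assuming pyannote.audio 3.1 and PyTorch are installed and the access token has been granted access to the model. `load_pipeline` and `MODEL_ID` are names chosen for this example, and the imports are deferred so the helper can be defined even where those packages are missing.

```python
MODEL_ID = "pyannote/speaker-diarization-3.1"


def load_pipeline(auth_token, prefer_gpu=True):
    """Load the 3.1 pipeline and move it to the GPU when one is available."""
    # Deferred imports: only needed when the pipeline is actually loaded.
    import torch
    from pyannote.audio import Pipeline

    pipeline = Pipeline.from_pretrained(MODEL_ID, use_auth_token=auth_token)
    use_cuda = prefer_gpu and torch.cuda.is_available()
    pipeline.to(torch.device("cuda" if use_cuda else "cpu"))
    return pipeline
```

Usage would look like `pipeline = load_pipeline("key")`, followed by a check with nvidia-smi while a file is being processed.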

hbredin mentioned this issue Sep 27, 2023

fix: fix WeSpeakerPretrainedSpeakerEmbedding GPU support #1478

Merged

hbredin closed this as completed in #1478 Sep 28, 2023

hbredin mentioned this issue Sep 29, 2023

pyannote/speaker-diarization-3.0 slower than pyannote/speaker-diarization? #1481

Closed

hbredin mentioned this issue Nov 9, 2023

Get rid of ONNX WeSpeaker in favor of its pytorch implementation #1537

Closed
