pipeline.to(torch.device("cuda")) not working on T4 Tesla GPU (pyannote==3.0.0) #1475

Closed
guilhermehge opened this issue Sep 26, 2023 · 9 comments · Fixed by #1478

Comments

@guilhermehge

guilhermehge commented Sep 26, 2023

I've been testing the new pyannote 3.0.0 today, but it seems that adding

import torch
pipeline.to(torch.device("cuda"))

to my code does not allocate the pipeline to the GPU anymore.

I have tried the following:

import torch
from pyannote.audio import Pipeline

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # evaluates to cuda on this machine
pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
).to(device)

##########

device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # returns cuda
pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
).to(torch.device(device))

##########

pipeline = Pipeline.from_pretrained(
    'pyannote/speaker-diarization-3.0',
    use_auth_token = 'key'
)

pipeline = pipeline.to(torch.device("cuda:0"))

But nothing seems to work.

When I check pipeline.device after applying the configuration, it returns device(type='cuda'), but the GPU is still not being used. This is what nvidia-smi returns WHILE THE PIPELINE IS RUNNING:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000001:00:00.0 Off |                  Off |
| N/A   43C    P0    27W /  70W |    855MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
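For anyone who wants to repeat this check programmatically rather than eyeballing the table, here is a minimal sketch that reads GPU utilization and memory use from nvidia-smi in CSV form. The helper names (parse_gpu_query, sample_gpus) are made up for illustration, and it assumes nvidia-smi reports plain integers for both fields (some setups report "[N/A]", which this sketch does not handle).

```python
import shutil
import subprocess
from typing import List, Optional, Tuple

# Ask nvidia-smi for bare numbers: one "utilization, memory" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_query(output: str) -> List[Tuple[int, int]]:
    """Parse 'util, mem' CSV lines into (utilization %, memory used in MiB) pairs."""
    samples = []
    for line in output.strip().splitlines():
        util, mem = (field.strip() for field in line.split(","))
        samples.append((int(util), int(mem)))
    return samples


def sample_gpus() -> Optional[List[Tuple[int, int]]]:
    """Return one sample per GPU, or None when nvidia-smi is not on PATH."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_query(result.stdout)
```

Calling sample_gpus() from a background thread while the pipeline processes a file would show whether utilization ever rises above 0%; a single reading taken between kernels can be misleading, so several samples are more informative than one.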

Colab notebook to reproduce the issue (MRE): https://colab.research.google.com/drive/16zpDvNa5fUs8a_r-d-DxbdAdLHEPrgta?usp=sharing

PS: this was working with the Interspeech and 2022.07 checkpoints and the previous version.

Edit: I did some testing and the problem seems to be the embedding model. When I switched to the "speechbrain/spkrec-ecapa-voxceleb" embedding model by editing the config.yaml file, the GPU was used properly.

@github-actions

Thank you for your issue. You might want to check the FAQ if you haven't done so already.

Feel free to close this issue if you found an answer in the FAQ.

If your issue is a feature request, please read this first and update your request accordingly, if needed.

If your issue is a bug report, please provide a minimum reproducible example as a link to a self-contained Google Colab notebook containing everything needed to reproduce the bug:

  • installation
  • data preparation
  • model download
  • etc.

Providing an MRE will increase your chance of getting an answer from the community (either maintainers or other power users).

Companies relying on pyannote.audio in production may contact me via email regarding:

  • paid scientific consulting around speaker diarization and speech processing in general;
  • custom models and tailored features (via the local tech transfer office).

This is an automated reply, generated by FAQtory

@realfolkcode

realfolkcode commented Sep 27, 2023

I am facing the same issue. If I understand correctly, this is due to the speaker embedding model running with ONNX (credits to #1476). But I am not sure that reverting to speechbrain/spkrec-ecapa-voxceleb would result in better quality. @hbredin, could you please explain how the speaker embedding model affects the quality of diarization? And how would 3.0 with the old embedding model perform against 2.1?

@hbredin
Member

hbredin commented Sep 27, 2023

Switching back to speechbrain/spkrec-ecapa-voxceleb would result in:

  • same missed detection and false alarm rates as 3.0 (which are much better than 2.1)
  • speaker confusion rate between that of 2.1 and 3.0 (but you would have to tune the clustering threshold hyper-parameter).

@hbredin
Member

hbredin commented Sep 27, 2023

Could you try this and let me know if it allows the pipeline to run on GPU on your side?

pip install https://github.com/pyannote/pyannote-audio/archive/refs/heads/fix/onnxruntime-gpu.zip

All it does is switch from onnxruntime to onnxruntime-gpu, which does seem to also support CPU.
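To see whether the installed ONNX Runtime build can use the GPU at all, one option is to inspect its execution providers. The sketch below is illustrative and is not taken from the pyannote.audio code base: choose_providers and available_providers are hypothetical helper names, and the only ONNX Runtime call used is onnxruntime.get_available_providers().

```python
from typing import List, Sequence


def choose_providers(available: Sequence[str], want_gpu: bool = True) -> List[str]:
    """Return execution providers in preference order, keeping only available ones."""
    preferred = ["CPUExecutionProvider"]
    if want_gpu:
        # Try CUDA first and keep CPU as the fallback.
        preferred = ["CUDAExecutionProvider"] + preferred
    return [name for name in preferred if name in available]


def available_providers() -> List[str]:
    """List the providers of the installed ONNX Runtime build (empty if not installed)."""
    try:
        import onnxruntime as ort
    except ImportError:
        return []
    return list(ort.get_available_providers())


if __name__ == "__main__":
    found = available_providers()
    print("available:", found)
    print("would use:", choose_providers(found))
```

If "CUDAExecutionProvider" is missing from the available list, the runtime can only run on CPU regardless of what pipeline.device reports, which would be consistent with the 0% GPU utilization described above.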

@guilhermehge
Author

Just to clarify, I only switched to the older embedding model to check if the problem was the newer model, just for diagnostics.

@gau-nernst

I noticed that 'pyannote/speaker-diarization-3.0' is noticeably slower than 'pyannote/speaker-diarization', even with the GPU fix. Has anyone else observed the same thing? I will put together some sample benchmark code when I have time.

@hbredin
Member

hbredin commented Sep 29, 2023

Hey @gau-nernst, I opened a related issue here: #1481. Please continue the discussion there.

@hbredin
Member

hbredin commented Nov 9, 2023

FYI: #1537

@hbredin
Member

hbredin commented Nov 16, 2023

Latest version no longer relies on ONNX runtime.
Please update to pyannote.audio 3.1 and pyannote/speaker-diarization-3.1 (and open new issues if needed).
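A quick way to confirm that an environment has actually picked up the update is to compare the installed version against 3.1. This is a rough sketch using only the standard library; the helper names are hypothetical, and it assumes the distribution is registered under the name "pyannote.audio".

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def major_minor(version_string: str) -> Tuple[int, ...]:
    """Extract the first two numeric components, e.g. '3.0.0' -> (3, 0)."""
    return tuple(int(part) for part in re.findall(r"\d+", version_string)[:2])


def is_at_least(version_string: str, minimum: Tuple[int, int] = (3, 1)) -> bool:
    """True when the version is at or above the given (major, minor) pair."""
    return major_minor(version_string) >= minimum


def installed_version(dist_name: str = "pyannote.audio") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pyannote.audio is not installed")
    else:
        print(found, "is 3.1 or newer" if is_at_least(found) else "is older than 3.1")
```

The version check only covers the library; the pipeline checkpoint has to be switched to pyannote/speaker-diarization-3.1 separately.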
