Add GPU stats to the /stats API and debug screen #3931

Merged
blakeblackshear merged 71 commits into blakeblackshear:dev from the gpu-stats branch
Nov 29, 2022

NickM-27
Collaborator

@NickM-27 NickM-27 commented Sep 25, 2022

Add the ability to view current GPU utilization and GPU memory utilization from hwaccel args.

Unfortunately, each GPU API has a different set of features, so there will be some differences between them (for example, nvidia-smi can return the GPU model).
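
As a rough illustration of the nvidia-smi case mentioned above, here is a minimal sketch of querying and parsing GPU stats. It is not the code from this PR. The function names `parse_nvidia_smi_csv` and `get_nvidia_gpu_stats` and the returned dict keys are hypothetical; the `--query-gpu` and `--format` flags are standard nvidia-smi options. The sketch assumes every queried field comes back as a number, which may not hold on every GPU.

```python
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi, in the order they are parsed below.
NVIDIA_QUERY = "name,utilization.gpu,memory.used,memory.total"


def parse_nvidia_smi_csv(output: str) -> List[Dict[str, str]]:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    stats = []
    for line in output.strip().splitlines():
        # Split from the right so a comma inside the GPU name is preserved.
        name, gpu_util, mem_used, mem_total = [
            field.strip() for field in line.rsplit(",", 3)
        ]
        # Assumption: the values are numeric (no "[N/A]" placeholders).
        mem_pct = round(float(mem_used) / float(mem_total) * 100, 1)
        stats.append({"name": name, "gpu": f"{gpu_util} %", "mem": f"{mem_pct} %"})
    return stats


def get_nvidia_gpu_stats() -> List[Dict[str, str]]:
    """Run nvidia-smi; return an empty list if it is missing or fails."""
    cmd = [
        "nvidia-smi",
        f"--query-gpu={NVIDIA_QUERY}",
        "--format=csv,noheader,nounits",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_nvidia_smi_csv(result.stdout)
```

Keeping the parsing separate from the subprocess call makes it possible to test the parser with canned output on a machine without an NVIDIA GPU.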

To-Do:

  • Add support for intel_gpu_top
  • Look into supporting RPi hwaccel stats (is it possible?)
  • Add tests for each specific GPU check

Screenshot 2022-11-13 at 2 50 15 PM

@NickM-27
Collaborator Author

NickM-27 commented Sep 25, 2022

Added intel_gpu_top for now, need to verify if it can work in the container: crzynik/frigate:gpu-stats

@netlify

netlify bot commented Nov 2, 2022

Deploy Preview for frigate-docs canceled.

Name Link
🔨 Latest commit 98adac4
🔍 Latest deploy log https://app.netlify.com/sites/frigate-docs/deploys/6385255880f4fe000871205d

@NickM-27
Collaborator Author

Here's an example of what happens if there is an error getting GPU stats:

Screen Shot 2022-11-15 at 08 55 34 AM

frigate/http.py
"""Get stats for cpu / gpu."""

async def run_tasks() -> None:
    await asyncio.wait(
Owner

Is the only reason to introduce asyncio stuff here so we can wait for these to run in parallel?

Collaborator Author

Yes. With the CPU and 2 GPUs, my production server takes ~7-10 seconds, and with this it's more like 3-5 seconds. There may well be better ways to parallelize it, or to run the actual checks in the background and cache / average the returned results so that /stats can return without waiting when it is called.
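
To make the two ideas in this reply concrete, here is a minimal sketch, not the code from this PR. `gather_stats` runs several blocking checks at once in worker threads, and `CachedStats` is a simple time-based cache (a simpler stand-in for a true background refresh). All names here (`slow_check`, `gather_stats`, `CachedStats`, `max_age`) are hypothetical, and the sleeps only simulate the slow external tools.

```python
import asyncio
import time
from typing import Callable, Dict, Optional

Check = Callable[[], Dict[str, str]]


def slow_check(name: str, seconds: float) -> Check:
    """Build a hypothetical blocking check standing in for a CPU or GPU query."""

    def check() -> Dict[str, str]:
        time.sleep(seconds)  # simulates waiting on an external tool
        return {"name": name}

    return check


async def gather_stats(checks: Dict[str, Check]) -> Dict[str, Dict[str, str]]:
    """Run every blocking check in a worker thread and wait for all of them."""
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(None, check) for check in checks.values()]
    results = await asyncio.gather(*futures)
    return dict(zip(checks.keys(), results))


class CachedStats:
    """Reuse the last result while it is younger than max_age seconds."""

    def __init__(self, checks: Dict[str, Check], max_age: float) -> None:
        self._checks = checks
        self._max_age = max_age
        self._value: Optional[Dict[str, Dict[str, str]]] = None
        self._stamp = 0.0

    def get(self) -> Dict[str, Dict[str, str]]:
        # Note: asyncio.run() must not be called from inside a running loop.
        now = time.monotonic()
        if self._value is None or now - self._stamp > self._max_age:
            self._value = asyncio.run(gather_stats(self._checks))
            self._stamp = time.monotonic()
        return self._value
```

With three checks that each block for the same time, the total wait is roughly that of one check rather than the sum, and a second `get()` within `max_age` returns immediately. The trade-off of the cache is that callers may see stats that are up to `max_age` seconds old.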

@NickM-27 NickM-27 force-pushed the gpu-stats branch 2 times, most recently from 4b3e124 to fa7a362 Compare November 26, 2022 14:01
@blakeblackshear blakeblackshear merged commit aaedd24 into blakeblackshear:dev Nov 29, 2022
@NickM-27 NickM-27 deleted the gpu-stats branch November 29, 2022 01:45
Development

Successfully merging this pull request may close these issues.

[Feature]: Better debug/logging/notification from ffmpeg - specifically around hardware acceleration