server: enable collecting CPU profiles at a lower rate to limit the CPU overhead of the profiling #75801

Open
knz opened this issue Feb 1, 2022 · 4 comments
Labels
A-cli-server CLI commands that pertain to CockroachDB server processes C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-observability

Comments

@knz
Contributor

knz commented Feb 1, 2022

We'd like to dump periodic CPU profiles on every node, or at least dump them when CPU usage spikes. That is, we'd like to reuse logic similar to what we already use to collect heap dumps and goroutine dumps (#75799).

Unfortunately, the pprof default profile rate (100Hz) is causing a noticeable (1-2%) performance dip.
Given that we usually need profiles when CPU is overloaded, the additional cost due to profiling is unwelcome.

So we'd like to explore a way to collect profiles at a lower sampling rate, to lower the overhead.

Sadly, pprof.StartCPUProfile(), which we currently use, hardcodes the rate at 100Hz.

We haven't yet found another way to do this short of forking pprof.

Jira issue: CRDB-12842

@knz knz added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-cli-server CLI commands that pertain to CockroachDB server processes T-observability-inf labels Feb 1, 2022
@knz
Contributor Author

knz commented Feb 1, 2022

We're going to add this as one more item that can motivate a custom go runtime extension.

@tbg
Member

tbg commented Feb 1, 2022

related: #75799

@knz
Copy link
Contributor Author

knz commented Feb 1, 2022

@felixge do you happen to have ideas of APIs we can use short of forking pprof?

@felixge

felixge commented Mar 19, 2022

@knz hey, sorry for the late reply, I'm digging my way out of a huge email backlog right now 🙈.

You should be able to call runtime.SetCPUProfileRate(10) to reduce the sampling rate by 10x (from 100Hz to 10Hz). Calling pprof.StartCPUProfile() afterwards will print a warning, but the warning is wrong and can be ignored. The requested sample rate should still take effect.

See golang/go#42502 for more details and upcoming changes to this API


3 participants