-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime/pprof: NewCPUProfile + cpuProfile.Start to allow profile configuration #42502
Comments
runtime/pprof: method StartCPUProfile adds a parameter to allow custom cpu rate.
Change https://golang.org/cl/269437 mentions this issue: |
so we can add method |
I don't think that is the best name, but, yes, something along those lines could work. |
ok, let me think about it |
runtime/pprof: method StartCPUProfile adds a parameter to allow custom cpu rate.
runtime/pprof: method StartCPUProfile adds a parameter to allow custom cpu rate.
Change https://golang.org/cl/269537 mentions this issue: |
I turned this issue into a proposal for the API change. |
@ianlancetaylor Thanks for turning this into a proposal. The current proposal is specific to
|
@hyangah It sounds like you are suggesting some kind of
or even
? |
It sounds like maybe the interface in the previous message is the answer if we want to solve this problem. |
@rsc Choosing the right profiling rate is hard. There are many theories and different opinion around choosing the right value - http://www.brendangregg.com/blog/2014-06-22/perf-cpu-sample.html is talking about a lockstep sampling issue, #35057 is discussing scaling the frequency based on the system's capability because 100hz is sometimes too high, and @rogerlucena wants higher than 100hz (aside from whether 800 hz is reasonable or not). It looks to me it's a parameter people still want to experiment and tune. |
@rsc For my application it was crucial to set a sampling rate higher than 100 Hz to see something useful in the profiling output, as I explained and detailed in the Issue description above. Apart from that, in this description I also referenced other people for which customizing the sampling rate can still be useful: stack overflow and google groups discussion. |
One question I haven't seen brought up is what happens if the user attempts to create multiple A simple answer would be that you are only allowed to have one running profile at a time, so the second call to Alternatively, we could support multiple concurrent profiles. The main use case I see for that is to enable an always-running, low-frequency, fleet-wide profile without blocking the ability to capture a short, high-frequency profile. In the runtime, we would probably implement this by using the highest profile frequency at any given time for all profiles. We could optionally sub-sample that for lower rate profiles to prevent unexpectedly large profile data. The pprof format supports variable-weight samples, so sampling at a higher rate than requested is not a problem from a format perspective. |
Given proposals to extend |
I agree that CPUProfile may not be the best name. For PMU events such as "branch mispredicts", "cycles" and "cache misses", if we use the period as the trigger condition, then I think a core property of them is that we sample once the event occurs a certain number of times. For os-timer, sampling is performed at regular intervals. Therefore, it is difficult for CPUProfile to cover these two different properties. And from the perspective of profiling the execution state of the CPU, perhaps CPUProfile is a good name. But it would be best if there was a better name. |
Since we're in the freeze and this is work for a new API, I don't think anything is going to happen here. I'm moving this into Go 1.21. This has been sitting here for quite a long time, and at this point I think we just need someone to implement it. Marking as help wanted. |
The follow-up discussion on adding PMU configuration has some effects on the API here. It adds a Bringing these together and working out a few more details, I think the API for this issue is:
Does this seem right? Some of this documentation would be generalized when and if we add PMU events, but for this issue I wanted to keep it more specific. |
The https://github.com/golang/go/issues/53286 also adds the following method.
I can do it, and I have already sent out a prototype for these two proposals, see https://go-review.googlesource.com/c/go/+/410799/9 |
Change https://go.dev/cl/410797 mentions this issue: |
It seems this didn't quite make it for Go 1.21. Since it's an accepted proposal we should probably do something about it in the next dev cycle. Kicking to Go 1.22. |
Just adding my 2c in favor of making the rate configurable: if the rate is hardcoded to 100Hz, this does not stop real world CPUs from getting faster. The given profiling/test run that managed to capture some samples in the past may start getting fewer samples, or even no samples at all when run on faster and faster CPUs - eventually the time passing between I just dealt with this issue on a pretty high-powered server, and had to set the rate to 500Hz in order to capture just 3 samples from a run with relatively complex logic. I think that maybe Hz isn't the right unit for controlling sampling rate, since it's absolute. Something relative to CPU speed (e.g. number of CPU cycles) might be more appropriate. |
The tree is frozen for Go 1.23. Should we move this to Go 1.24, @mknyszek ?
On Linux, the current approach of asking the kernel to track on-CPU time means we're limited by its tick frequency. Often that's set to 250 Hz, though some sites configure it to be 1000 Hz (or 100 Hz). See #35057 for what it can look like when a process (or now, a thread) earns more than one sample per kernel tick. Sampling at faster than kernel's tick interval will require a different approach, not just a config change. (It would be neat to use
But also on Linux, we now set the initial expiration of the timer to be uniformly distributed between 0 and 10 ms rather than fixed at 10 ms. That provides a fair chance of getting a sample even for very short profiling windows or very light workloads (assuming the timing of the work isn't correlated with the kernel's timer interrupt). So it should be safe and fair to collect several profiles (or thousands, if you like) and merge them to get a more detailed view. |
The PR google/badwolf#155 referenced by OP also helped me to reveal a problem that wasn't showing up with the hard-coded profiling. This change, along with possible future support for the PMU, is very welcome. |
Originally posted by @cherrymui in #40094 (comment)
The pprof package indeed does not allow us to customize the CPU profiling rate at the moment. This was something critical for my application in google/badwolf#155, since the default profiling rate of 100 was extremely low and did not allow me no see anything in my CPU profile.
To work around this, I had to make a call to
runtime.SetCPUProfileRate(800)
(with a rate of around 800 hz to see something meaningful and helpful in my CPU profile) before callingpprof.StartCPUProfile
. Since there is a hard-coded call toruntime.SetCPUProfileRate(100)
inside thepprof.StartCPUProfile
, a warning is being printed to my terminal at the moment saying "runtime: cannot set cpu profile rate until previous profile has finished" as this call toruntime.SetCPUProfileRate(100)
failed (the booleancpuprof.on
was already set to true in the previous call toruntime.SetCPUProfileRate(800)
). This warning is annoying since it is misleading for someone who does not know the internals ofpprof.StartCPUProfile
, but the important here is that the CPU profiling is done with the first profiling rate that I manually set (800, as I wanted and as wanted from the user point of view).Given this, what I am trying to say is that the public/exported method
runtime.SetCPUProfileRate
is still useful, and sometimes even critical for allowing us to use wellpprof
and to see something relevant on our profiles, depending on the application. I am not alone on this as you can see here (stack overflow) and here (google groups discussion). Even one principal software engineer (L8) I have been working with at Google found it useful when reviewing my PR google/badwolf#155 above and copied the code to use it in one of his applications, on which setting a higher profiling rate was critical as well.That said, I believe we could, from now on, follow three different paths:
The ideal would be for the
pprof.StartCPUProfile
method already receive as argument the CPU profile rate we want, allowing us to customize it, with this argument having its default value set to 100 if it was not specified by the function call. This would be the ideal, if we had support for default arguments in Go. But, we can achieve something similar if we had a similar method with signaturepprof.StartCPUProfileWithSpecifiedRate(w io.Writer, hz int)
, in addition to thepprof.StartCPUProfile(w io.Writer)
method we already have. This way, theruntime.SetCPUProfileRate
method would not necessarily need to be an exported method anymore.We could keep
runtime.SetCPUProfileRate
public/exported, but we would have to fix the problem with the misleading warning message "runtime: cannot set cpu profile rate until previous profile has finished" if we had already manually set the profiling rate before the call topprof.StartCPUProfile
. For this we could:2.1. Have an auxiliar boolean method
runtime.isCPUProfileRateAlreadySet
, and call it insidepprof.StartCPUProfile
before the hard-coded call toruntime.SetCPUProfileRate(100)
there, trying to set the profiling rate to 100 only if this rate was not already previously set. This way, we would not see any strange warning after a call topprof.StartCPUProfile
if we had already previously set the profiling rate manually. I like this solution because it makes minimal changes on your code, it does not change any signature and it resolves exactly the problem we are trying to solve here, while keeping the warning message for other calls toruntime.SetCPUProfileRate
that were not frompprof.StartCPUProfile
in the situation above.2.2. Make
runtime.SetCPUProfileRate
return an error in the case the profiling rate was already previously set, allowing the caller to handle it and decide what to do from then on (insidepprof.StartCPUProfile
we could ignore this error for example). This would change the signature ofruntime.SetCPUProfileRate
.We could move the
SetCPUProfileRate
public signature to thepprof
package (with some changes as explained above to avoid the misleading warning message), and deprecate it / make it unexported inside theruntime
package.These were just some of the alternatives that I thought about for now. I would like to hear more from you on what you think about them and if you have any other suggestions for fixing what I explained above.
The text was updated successfully, but these errors were encountered: