-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add perf exporter #1274
Add perf exporter #1274
Conversation
f17994f
to
0468a0c
Compare
I'm going to have to dig deeper into the new included libraries, but it looks like so far everything is being returned as gauge values. We prefer to expose raw underlying counters in Prometheus, rather than try to use pre-calculated rates. Is this possible with perf? |
Yeah, most of the underlying metrics are counters and I just copy pasted some of the other exporter code to get this started. I think most of the metrics will be mapped to counters. The one thing I need to figure out is how are overflows handled normally for other exporters. |
For counters, they should all be named If you have concerns about 2^53 uint64 overflows, take a look at how we handle it in the snmp_exporter. |
Pushed up changes, I think this should be good if you want to test it locally. |
I started with Then I ran a couple collections and it seems to be resetting counters after each scrape:
Sadly, this seems to persist across restarts of the exporter as well. |
I'm not sure what capabilities you are running so that may play a factor. The one thing to note from the
Due to the fact that it is currently configured to trace all processes on the specific CPU I doubt a paranoid value of 2 (allow only user-space measurements) would work. |
Even with I'm also seeing this error after a few scrapes:
Looks like there may be a leak. |
Real dumb mistake, I was assuming that
|
A couple of documentation items:
I would add something like this to the README.md: The
See the [upstream docs](link here). |
👍 I pushed up some doc changes and cleaned up the commit history. |
6893f47
to
24c8c5b
Compare
continue | ||
} | ||
|
||
if hwProfile.CPUCycles != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to refactor this a bit, maybe by using a map and loop over it like we do in similar cases in other collectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made a couple of attempts at this but creating the map becomes rather difficult without using reflection because not all struct fields may be present. I can git it another attempt and it might save a little of the redundant checks, but would probably be slower. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got it working pretty well now.
@hodgesds Can you address the remaining comments? |
c37234b
to
9b368d5
Compare
Updated, let me know what you think. In the future I'd like to add support for kprobes but that requires a more thought into configuration if you have any ideas (does it make sense to have a config file?). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some changes. Beside that, I still think we could do better to avoid repetition but I think it's fine for now.
Ping, please rebase this. |
906ba8d
to
f121e09
Compare
Added a test that will skip if |
Signed-off-by: Daniel Hodges <[email protected]>
Okay, looks good to me. We have worse cases of repetition and I know it's tricky. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Are these new metrics enabled by default? are there docs written somewhere for this new feature? |
See the README. |
Signed-off-by: Daniel Hodges <[email protected]>
Signed-off-by: Daniel Hodges <[email protected]>
This implements #1238 by adding perf based profiling metrics. It's still a work in progress but so far it seems to be working locally. I'd like to instrument more of the already available metrics but figured I'd put this out there for others to see.