Flame Graph support on Web UI #188

spiermar · 2017-08-10T00:55:42Z

Implements #166

Details:

Basic flame graph implementation under http output.
Tested with both CPU and Memory profiles.
Endpoint is /flamegraph, and there's a link in /.
Opted not to use the directed graph to generate the flame graph since it doesn't have the detailed information required. Parsed the raw samples instead.
Not doing any filtering in the samples.
Output is mostly self-contained and could be saved. It should also be easy to add an option to export the html file.
Making available all sample types in the UI, via a drop-down. e.g.: CPU profile will have samples and cpu.
Displaying some profile metadata like profile time and duration.
Converting nanoseconds to seconds. I'm not sure if this is ideal, but the same thing is being done in the main view. Happy to revert it if necessary.
Using a bootstrap layout now, but happy to revert it if you believe it would be better to keep a simple layout without the bootstrap dependency.
Implements all features mentioned by @brendangregg, except different palettes, which is under development and requires changes in the D3 plugin.
We are continuously adding more features to the D3 plugin, and those should be easily assimilated by this.

A demo can be found in pprof_flame.html

Happy to make any changes deemed necessary.

codecov-io · 2017-08-10T01:12:53Z

Codecov Report

Merging #188 into master will increase coverage by 0.55%.
The diff coverage is 95.68%.

@@            Coverage Diff             @@
##           master     #188      +/-   ##
==========================================
+ Coverage   65.29%   65.84%   +0.55%     
==========================================
  Files          34       35       +1     
  Lines        7212     7346     +134     
==========================================
+ Hits         4709     4837     +128     
- Misses       2109     2113       +4     
- Partials      394      396       +2

Impacted Files	Coverage Δ
internal/driver/webui.go	`58.44% <100%> (+0.19%)`	⬆️
internal/driver/webhtml.go	`100% <100%> (ø)`	⬆️
internal/driver/flamegraph.go	`85% <85%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 71d5bad...7b1f77e.

aalexand · 2017-08-10T01:26:19Z

Given that pprof is vendored into Golang sources and given that it's used internally at Google, I am skeptical about us being able to accept the source which <script>'s a number of locations on the Internet. @rsc @rauls5382 @rakyll

spiermar · 2017-08-10T01:53:31Z

@aalexand If that's a concern, it should be possible to source those files from pprof's http server.

rauls5382 · 2017-08-10T14:39:21Z

Thank you. This looks pretty cool.

I agree we should freeze the version of all scripts/plugins and serve them from the pprof http server. That way pprof won't be broken if there are incompatible changes in the future.

The Go distribution would have to vendor any dependent packages as well, similar to what they already do to pick up the ianlancetaylor/demangle package. If they decide against it (which I doubt), we could find a way to make it easy to trim that functionality via plugins.

spiermar · 2017-08-10T17:24:30Z

Thanks @rauls5382

Versioning shouldn't theoretically be a problem, since all external resource references are versioned CDN URLs. Security-wise, the external resource could be compromised, so I've added an integrity check to all resources to be on the safe side.

If vendoring the dependent packages is a better alternative, there are a few different ways of getting this done, and I'm not sure what would be the best option from a pprof perspective. Let me know and I can make the changes.
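
As an illustration of the integrity check mentioned above, the sketch below computes a Subresource Integrity value for a script body. It assumes SHA-384 as the digest; the `sriHash` helper, the placeholder script, and the `cdn.example.com` URL are hypothetical and not part of this PR.

```go
package main

import (
	"crypto/sha512"
	"encoding/base64"
	"fmt"
)

// sriHash returns a Subresource Integrity value of the form
// "sha384-<base64 digest>" for the given resource body.
func sriHash(body []byte) string {
	sum := sha512.Sum384(body)
	return "sha384-" + base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// Placeholder content; in practice this must be the exact bytes
	// served by the CDN, or the browser will refuse to run the script.
	js := []byte("console.log('hello');")
	fmt.Printf("<script src=%q integrity=%q crossorigin=\"anonymous\"></script>\n",
		"https://cdn.example.com/lib.js", sriHash(js))
}
```

The browser recomputes the digest on load, so a tampered resource fails to load instead of executing. It does not remove the need for network access, which is the concern raised later in the thread.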

rakyll · 2017-08-10T21:58:47Z

@spiermar Can you compile all the external scripts into a single one and embed the final artifact in a Go file? Having said that, I am not sure how licensing will work. @aalexand, @rauls5382, do we need a separate directory in the repo with a licensing file to be able to vendor the external dependencies?

spiermar · 2017-08-10T22:10:16Z

@rakyll I already have the scripts to create combined and minified vendor.js and vendor.css files. Licensing is the part that needs checking. The flame graph plugin is Apache 2.0, so not a problem. d3 is BSD-3 and d3-tip is MIT. I already removed the lodash dependency in the last commit, so that's not a problem anymore. Bootstrap could be removed.

aalexand · 2017-08-14T22:17:40Z

@rakyll Yes, I would probably expect something like third_party/d3 and third_party/flame-something-something for the dependencies along with the LICENSE files in there. It would be good to pre-check with the Go team maybe. What concerns me the most is having pprof which is a local tool fetch any resources from elsewhere on the web. Security aside, pprof and "go tool pprof" should be functional without requiring internet access.

spiermar · 2017-08-14T22:35:12Z

I'll make the changes later today

aalexand · 2017-08-14T22:44:29Z

https://github.com/google/pprof/blob/master/third_party/svg/svgpan.go is along the lines of what we want.

spiermar · 2017-08-15T05:59:21Z

JavaScript resources are inline now and I've also removed the bootstrap dependency. License files are also checked in.
Considered using the minified versions of the scripts, but decided to follow the same pattern from the svgPan third party library.
Same for inline vs. independent resource endpoints. Inlining is not ideal for multiple reasons, but followed the same pattern as svgPan. On the plus side, the HTML can be saved this way.
View could benefit from a few UI/UX improvements, but it's working.

aalexand · 2017-08-15T17:11:17Z

Thanks. Some comments:

The code coverage went down. Can you add tests to cover the new code?
I tried opening the flame graph with a 5 MB profile.pb.gz file (which I can't immediately share), and the graph view choked on it. Do you want to test with some large profiles to see whether there are some optimizations to do?
I would expect a bit more unification between the graph view and the flame graph view. More specifically:
- The metric selection (e.g. samples vs. cpu) should be present on both pages.
- The filtering like focus / ignore / hide / show should be present on both pages.
- There should be a way back from the flame view back to the graph.

spiermar · 2017-08-15T18:03:57Z

Yes, had it on my list to add a few tests.
The sluggishness on large flame graphs comes from the large number of rect elements that need to be drawn, generally because of tons of small-value elements that can hardly be seen without zooming. I'm looking into hiding small frames (minimum size for flame elements, spiermar/d3-flame-graph#51), but that's not done yet. A temporary fix might be filtering before creating the data structure, as the directed graph does, but that reduces accuracy. I can add filtering, but would also like to give the user an option to disable it.
Agree with the metric selection on both views. I just need to understand how to get that out of the report to add a selection in the other view.
The focus/ignore/hide/show filtering is a bit more tricky since they are not implemented in the plugin and generally not how users interact with flame graphs. Zooming and searching are the basic interactions. @brendangregg, any thoughts?
I will add a link back to the graph.
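
The "filter before creating the data structure" idea can be sketched as a pruning pass that drops frames below a fraction of the total. The `frame` type, the `prune` and `count` helpers, and the 0.5% threshold are assumptions for illustration; passing a fraction of 0 keeps every frame, which corresponds to the option to disable filtering.

```go
package main

import "fmt"

// frame is a hypothetical flame graph element.
type frame struct {
	Name     string
	Value    int64
	Children []*frame
}

// prune removes every descendant whose value is below minFraction of total.
// A dropped frame takes its whole subtree with it.
func prune(f *frame, total int64, minFraction float64) {
	kept := f.Children[:0]
	for _, c := range f.Children {
		if float64(c.Value) < minFraction*float64(total) {
			continue
		}
		prune(c, total, minFraction)
		kept = append(kept, c)
	}
	f.Children = kept
}

// count returns the number of frames in the tree, including f itself.
func count(f *frame) int {
	n := 1
	for _, c := range f.Children {
		n += count(c)
	}
	return n
}

func main() {
	root := &frame{Name: "root", Value: 1000, Children: []*frame{
		{Name: "hot", Value: 995, Children: []*frame{{Name: "leaf", Value: 995}}},
		{Name: "tiny", Value: 4},
		{Name: "small", Value: 1},
	}}
	before := count(root)
	prune(root, root.Value, 0.005) // drop frames under 0.5% of the total
	fmt.Printf("frames: %d -> %d\n", before, count(root))
}
```

The trade-off is the one noted above: fewer elements to draw, at the cost of hiding small frames that would only become visible after zooming.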

spiermar · 2017-11-19T02:11:13Z

@aalexand Finally got a bit of time to get back to this. The call_tree option is done, and also the percentage of total. That's all I had on my list. Let me know how that looks.

aalexand

Thanks!

Is the current code rebased to the latest master?

aalexand · 2017-11-19T05:57:32Z

third_party/d3tip/d3_tip.go

+package d3tip
+
+// D3TIP returns the d3-tip.js file
+const D3TIP = `


D3TIP is not compliant with Go style. It should be thisStyle (for package private) or ThisStyle (for public). For this variable, how about something like JSSource or Source?

This comment applies to d3.go and d3_flame_graph.go as well.

aalexand · 2017-11-19T05:59:16Z

third_party/d3tip/d3_tip.go

+
+// D3TIP returns the d3-tip.js file
+const D3TIP = `
+{{define "d3tipscript"}}


For consistency with svgpan.go can we move the template definition part out of this file and only have the JS source constant defined here?

This comment applies to d3.go and d3_flame_graph.go as well.

aalexand · 2017-11-19T06:00:35Z

internal/driver/flamegraph.go

+
+// percentage computes the percentage of total of a value, and encodes
+// it as a string. At least two digits of precision are printed.
+func percentage(value, total int64) string {


I think we should do #265 and use measurement.Percentage in the code below so that this function can be dropped and so that the percentage formatting is enforced to be consistent.

Should I wait until it gets merged?

I merged that, you can rebase.
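
For context, a helper of the kind under discussion might look like the simplified sketch below. This is not the implementation of measurement.Percentage or of the function in this PR; the fixed two-decimal format and the zero-total handling are assumptions.

```go
package main

import "fmt"

// percentage formats value as a percentage of total with two decimal
// places. A zero total yields "0%" rather than dividing by zero.
// Simplified illustration only; the real formatting rules may differ.
func percentage(value, total int64) string {
	if total == 0 {
		return "0%"
	}
	ratio := float64(value) / float64(total) * 100
	return fmt.Sprintf("%.2f%%", ratio)
}

func main() {
	fmt.Println(percentage(1, 4))
	fmt.Println(percentage(1, 3))
	fmt.Println(percentage(5, 0))
}
```

Routing every view through one shared formatter is what keeps the percentages shown in the graph view and the flame graph view consistent.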

aalexand · 2017-11-19T06:03:20Z

internal/driver/flamegraph.go

+	// Calculate root value
+	rootValue := int64(0)
+	for _, n := range nodes[0:nroots] {
+		rootValue = rootValue + n.Cum


Nit: this can be done near nroots++ above, no need to have a separate loop.
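
A sketch of the suggested change, folding the sum into the loop that counts roots. The code surrounding `nroots++` is not shown in this thread, so the `node` type, the `HasParent` field, and the partitioning step are assumptions made for illustration.

```go
package main

import "fmt"

// node is a hypothetical graph node; HasParent marks non-root nodes.
type node struct {
	Name      string
	Cum       int64
	HasParent bool
}

// rootsFirst moves root nodes to the front of the slice and, in the same
// pass, returns how many roots there are and the sum of their Cum values.
func rootsFirst(nodes []*node) (nroots int, rootValue int64) {
	for i, n := range nodes {
		if n.HasParent {
			continue
		}
		nodes[nroots], nodes[i] = nodes[i], nodes[nroots]
		nroots++
		rootValue += n.Cum // accumulated here, so no second loop is needed
	}
	return nroots, rootValue
}

func main() {
	nodes := []*node{
		{Name: "a", Cum: 4, HasParent: true},
		{Name: "b", Cum: 7},
		{Name: "c", Cum: 2, HasParent: true},
		{Name: "d", Cum: 3},
	}
	nroots, rootValue := rootsFirst(nodes)
	fmt.Println(nroots, rootValue)
}
```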

spiermar · 2017-11-19T06:17:43Z

@aalexand Yes, it was rebased.

aalexand

I also wonder if the size of the dependencies can be further reduced. I saw d3 is composed of microlibraries. Do we need all of d3 here, or can we limit the dependency to a couple of those?

spiermar · 2017-11-19T18:40:50Z

I could create a custom D3 build. Let me check how much that would reduce the bundle size.

aalexand · 2017-11-19T23:46:05Z

@spiermar Great, thanks a lot, that's much smaller! Can you add a README.md in the d3 directory which documents how the steps you did can be repeated to upgrade from upstream when we need to?

aalexand · 2017-11-20T00:10:06Z

internal/driver/flamegraph.go

+// flamegraph generates a web page containing a flamegraph.
+func (ui *webInterface) flamegraph(w http.ResponseWriter, req *http.Request) {
+	// Force the call tree so that the graph is a tree.
+	rpt, errList := ui.makeReport(w, req, []string{"svg"}, "call_tree", "true")


Please update this to

// Force the call tree so that the graph is a tree. Also do not trim the tree
// so that the flame graph contains all functions.
rpt, errList := ui.makeReport(w, req, []string{"svg"}, "call_tree", "true", "trim", "false")

The GetDOT call trims the tree by default, and for the flame graph I think it's reasonable to override that.

Up to you. I prefer to have the flame graph contain everything, but that's not the default behavior for the directed graph, so I wasn't really sure what users would prefer.

The graph does it by default because displaying the full graph for large programs becomes messy. The flame graph appears to be compact and readable enough on some larger profiles I tried, so I think trimming the tree would be more confusing than useful here.

aalexand · 2017-11-20T00:11:47Z

third_party/d3tip/d3_tip.go

+
+// Source returns the d3-tip.js file
+const Source = `
+{{define "d3tipscript"}}


How difficult is it to move the template directive out of these files? This is a nit, I am just trying to make all third party files consistent here: like svgpan, they should only contain the JS source as is, without the additional directives, so that they are plain and simple and can be easily upgraded to a newer version without accidentally losing any content.

Should be simple, just move the directives. It's being called only in one place. Will check it later tonight.

aalexand · 2017-11-20T00:19:01Z

@spiermar Overall I think it looks good now. One thing: would you be open (and would it be difficult) to switch the flame graph to grow down rather than up? We found recently that with deep flame graphs it looks more natural in the UI when they grow down and you have to scroll to get the finer level of details (deeper elements) rather than having to scroll just to see the root. See the attached example.

spiermar · 2017-11-20T00:59:18Z

@aalexand it's in the roadmap for the d3 flame graph plugin (spiermar/d3-flame-graph#73), but I don't have a target date for the release yet.

brendangregg · 2017-11-20T04:55:09Z

My original flamegraph.pl did this layout too (icicle), but here's how I interpret each flame graph:

1. Sweep top down and look for what's on-CPU, especially any large plateaus, to identify what the CPU is doing.
2. Then sweep bottom-up to see how we got there, which is generally quicker after knowing (1).

Last week I had a large plateau of UTF-8 processing (51%), then swept bottom up to look for UTF-8 or string-related functions, skipping the rest (a lot of framework frames). This meant I could find the relevant UTF-8/string frames more quickly, after knowing the on-CPU context first.

Another recent one was gzip. Another was kernel network interrupt processing -- reading that bottom up would be a waste of time, since what's ultimately burning CPU is unrelated to the user-level program that's running.

Reading bottom-up (root first) still gets the job done. My point is that knowing some on-CPU context first can make that quicker, as you have a clue as to what you're looking for.

I've used the icicle layout when reversing the stack order as well, so that the top frame remains what is on-CPU, and beneath it is ancestry.

That's one of the big reasons we wanted to rewrite this in d3, so that flipping the layout direction and merge direction could be button presses in an interactive d3 graph. Doing it with my Perl program meant rerunning the Perl program.

aalexand · 2017-11-20T20:56:59Z

@spiermar Note a couple of other open comments I made.

spiermar · 2017-11-20T21:11:18Z

@aalexand regarding the D3 build, sure. I used rollup to build the distribution and the configuration is on my GitHub page (https://github.com/spiermar/d3-pprof). With that it's just a single command to generate the build. If it's OK to link to the repo, I'll add the instructions to a README.md file on pprof and also update the repo with the instructions.

aalexand · 2017-11-20T21:26:04Z

@spiermar I'd prefer to include self-contained instructions on how to do the build inside of pprof than point to another repo. This is to keep the deps minimal, even if just for the build process.

spiermar added 10 commits August 8, 2017 18:00

feat: working flame graph web view

06f02b6

feat: select flamegraph sample type

db1575c

feat: profile details

7f9e781

feat: dynamic unit definition

8c11fce

feat: button from main view

b6e1efc

feat: dynamic chart height

a55d5f9

feat: converting nanoseconds to seconds

acfe4b9

feat: selecting default metric based on sample_index variable

faa772e

Merge branch 'master' into feat_flamegraph

96445d3

feat: better cpu time conversion

df97f08

googlebot added the cla: yes label Aug 10, 2017

spiermar mentioned this pull request Aug 10, 2017

Web UI should support flame graphs #166

Closed

feat: external resource integrity check

5bd8ef3

reafactor: dropping lodash script dependency

5d827ef

feat: friendly error message if an external resource failed to load

edb9617

spiermar added 2 commits August 14, 2017 20:23

refactor: serving javascript and css resources inline

0afa5df

refactor: removing bootstrap dependency

a397cfc

spiermar added 2 commits November 18, 2017 17:57

feat: percentage of total

46ed7a2

Merge branch 'master' into feat_flamegraph

370fc15

aalexand reviewed Nov 19, 2017

refactor: style adjustments

0e1724b

aalexand reviewed Nov 19, 2017

refactor: custom d3 build

b9de433

aalexand reviewed Nov 20, 2017

aalexand mentioned this pull request Nov 20, 2017

Update the web user interface aesthetically #263

Merged

spiermar added 6 commits November 20, 2017 18:27

Merge branch 'master' into feat_flamegraph

7307bc6

# Conflicts: # CONTRIBUTORS

refactor: use measurement package percentage function

49a0976

refactor: remove template directive from third_party files

c548705

func: do not trim the tree so that the flame graph contains all funct…

f465f8a

…ions

docs: instructions on how to built the custom d3 bundle

1bd096a

docs: style improvement on d3 readme

7b1f77e

aalexand approved these changes Nov 21, 2017

aalexand merged commit 80fc05d into google:master Nov 21, 2017

sean- mentioned this pull request Nov 21, 2017

Need to improve performance compared to TCP quic-go/quic-go#790

Closed

nolanmar511 mentioned this pull request Jan 29, 2018

internal/driver: test timed out after 3m0s on freebsd, openbsd, windows #300

Closed

gmarin13 pushed a commit to gmarin13/pprof that referenced this pull request Dec 17, 2020

Flame Graph support on Web UI (google#188)

738874e

Implements google#166.
