summarize: add "aggregate" sub-command for analyzing sets of profiles. #129

eddyb · 2020-08-23T07:27:18Z

For my ongoing rdpmc instruction-counting work I've ended up with a lot of profile results (over 700 and there were hundreds I've already removed), that I don't really have any tools to analyze in-depth.

The fundamental problem is that e.g. summarize summarize will effectively sum together many small intervals between consecutive timestamps/instruction counts, so the distribution of variance in individual samples (of the clock/instruction counter), or the exact sources of large variances, are lost in the summary.

This PR seeks to address that by introducing a summarize aggregate command specifically for sets of deterministic runs (identical in all respects except the timestamps themselves). Example output:

$ summarize aggregate core-{48481,48701,48905,49453,49648,49860,50178,50587,50783,50978}
Smallest 10 durations:
  80 ns: 81 occurrences, or 0.00%
  90 ns: 10018 occurrences, or 0.59%
  91 ns: `hir_owner()`
  100 ns: 38546 occurrences, or 2.26%
  101 ns: 9 occurrences, or 0.00%
  110 ns: 49097 occurrences, or 2.87%
  111 ns: 14 occurrences, or 0.00%
  120 ns: 42200 occurrences, or 2.47%
  121 ns: 22 occurrences, or 0.00%
  130 ns: 52373 occurrences, or 3.07%

Largest 10 durations:
  111073277 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  112291280 ns: `free_global_ctxt()`
  117943976 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  126222248 ns: in `typeck()`,
    between `opt_const_param_of()`
        and `upvars_mentioned()`

  208308909 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  220730751 ns: `build_hir_map()`
  221392949 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  375357799 ns: `hir_lowering()`
  858511163 ns: `resolve_crate()`
  1139917634 ns: in `expand_crate()`, after `pre_AST_expansion_lint_checks()`

Smallest 10 variances:
  ±0 ns: 6057 occurrences, or 0.35%
  ±0.5 ns: 79 occurrences, or 0.00%
  ±4.5 ns: 8 occurrences, or 0.00%
  ±5 ns: 107368 occurrences, or 6.28%
  ±5.5 ns: 775 occurrences, or 0.05%
  ±9.5 ns: 55 occurrences, or 0.00%
  ±10 ns: 112015 occurrences, or 6.56%
  ±10.5 ns: 390 occurrences, or 0.02%
  ±14.5 ns: 42 occurrences, or 0.00%
  ±15 ns: 58164 occurrences, or 3.40%

Largest 10 variances:
  ±6107910 ns: `get_lib_features()`
  ±6942093.5 ns: `hir_lowering()`
  ±8277596.5 ns: in `privacy_access_levels()`,
    between `module_exports()`
        and `def_kind()`

  ±8340332.5 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  ±8977432 ns: `resolve_crate()`
  ±9465039 ns: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  ±9681275.5 ns: in `privacy_access_levels()`, after `module_exports()`
  ±9737941 ns: in `death_checking()`,
    between `codegen_fn_attrs()`
        and `inherent_impls()`

  ±10313490.5 ns: in `expand_crate()`, after `pre_AST_expansion_lint_checks()`
  ±14306281.5 ns: `free_global_ctxt()`

For comparison, this is my best set of runs so far, for instruction counting (with rdpmc measuring both instructions:u and hardware interrupts, to account for iret overcounting, and cpuid for serialization):

$ summarize aggregate core-{047003,047937,048427,048692,048991,049165,049351,049555,049743,050036} \
    | sed 's/ ns/ instructions/'
Smallest 10 durations:
  97 instructions: 3 occurrences, or 0.00%
  99 instructions: 2 occurrences, or 0.00%
  112 instructions: `llvm_dump_timing_file()`
  114 instructions: `dep_graph_tcx_init()`
  121 instructions: `link_binary_check_files_are_writeable()`
  123 instructions: `plugin_registration()`
  157 instructions: `serialize_work_products()`
  164 instructions: `attributes_injection()`
  166 instructions: `drop_dep_graph()`
  170 instructions: in `index_hir()`, after `build_hir_map()`

Largest 10 durations:
  289482983 instructions: `early_lint_checks()`
  421515729 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  426300619 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  434275596 instructions: in `typeck()`,
    between `opt_const_param_of()`
        and `upvars_mentioned()`

  754467760 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  769694431 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  1082435704 instructions: `hir_lowering()`
  1270692563 instructions: `build_hir_map()`
  3884803038 instructions: `resolve_crate()`
  4059854999 instructions: in `expand_crate()`, after `pre_AST_expansion_lint_checks()`

Smallest 10 variances:
  ±0 instructions: 1701795 occurrences, or 99.61%
  ±0.5 instructions: 519 occurrences, or 0.03%
  ±1 instructions: 11 occurrences, or 0.00%
  ±1.5 instructions: 61 occurrences, or 0.00%
  ±2 instructions: 290 occurrences, or 0.02%
  ±2.5 instructions: 108 occurrences, or 0.01%
  ±3 instructions: 6 occurrences, or 0.00%
  ±3.5 instructions: 124 occurrences, or 0.01%
  ±4 instructions: 68 occurrences, or 0.00%
  ±4.5 instructions: 71 occurrences, or 0.00%

Largest 10 variances:
  ±217.5 instructions: in `mir_borrowck()`,
    between `mir_validated()`
        and `is_late_bound_map()`

  ±236.5 instructions: in `mir_borrowck()`,
    between `is_late_bound_map()`
        and `type_op_prove_predicate()`

  ±263.5 instructions: in `mir_borrowck()`, after `is_late_bound_map()`
  ±267.5 instructions: in `mir_borrowck()`,
    between `mir_validated()`
        and `is_late_bound_map()`

  ±354.5 instructions: in `mir_borrowck()`, after `type_op_prove_predicate()`
  ±445 instructions: in `mir_borrowck()`, after `is_late_bound_map()`
  ±468.5 instructions: in `drop_ast()`, before `analysis()`
  ±649.5 instructions: `hir_lowering()`
  ±659 instructions: `free_global_ctxt()`
  ±1147 instructions: in `expand_crate()`, after `pre_AST_expansion_lint_checks()`

This is not the intended final output, just the simplest analysis I could do, that is still somewhat indicative of the quality of the data. To be truly useful in improving that quality, it has to be able to pinpoint to outliers, not just quantify their existence.

EDIT: added some very basic indications of sources of single occurrences (and updated the example output above)
EDIT2: streamlined the single-occurrence output and included the parent (or caller) of the two interval endpoints, when it's relevant (i.e. when the interval is part of it, between one call ending and the other starting, not either callee)

wesleywiser

This looks good to me. I think all of the feedback I could give is covered by the FIXMEs 🙂

measureme/src/counters.rs

eddyb · 2020-10-26T22:37:51Z

Uhh, please ignore the extra commits, they're from rebasing on the rdpmc branch, PR for which is coming soon.
I should've rebased on the commit just before but this was more convenient.

eddyb · 2020-11-03T10:27:01Z

Not sure if I should put this into the PR description, but these are the current results with #143's instructions-minus-irqs:u (specifically for the runs used in the "Results" section):

Smallest 10 durations:
  119 instructions: `llvm_dump_timing_file()`
  122 instructions: `dep_graph_tcx_init()`
  129 instructions: `link_binary_check_files_are_writeable()`
  131 instructions: `plugin_registration()`
  170 instructions: `serialize_work_products()`
  171 instructions: `drop_dep_graph()`
  176 instructions: `attributes_injection()`
  208 instructions: 3 occurrences, or 0.00%
  209 instructions: `link_binary_remove_temps()`
  210 instructions: 2 occurrences, or 0.00%

Largest 10 durations:
  437874189 instructions: `early_lint_checks()`
  456828967 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  469996314 instructions: in `typeck()`,
    between `parent_module_from_def_id()`
        and `upvars_mentioned()`

  501380829 instructions: `drop_ast()`
  678063701 instructions: in `typeck()`,
    between `opt_const_param_of()`
        and `in_scope_traits_map()`

  679879471 instructions: in `typeck()`,
    between `opt_const_param_of()`
        and `in_scope_traits_map()`

  1381363724 instructions: `hir_lowering()`
  2099521684 instructions: `build_hir_map()`
  3951196865 instructions: `resolve_crate()`
  6541458129 instructions: in `expand_crate()`, after `pre_AST_expansion_lint_checks()`

Smallest 4 variances:
  ±0 instructions: 1903452 occurrences, or 99.98%
  ±0.5 instructions: 423 occurrences, or 0.02%
  ±1 instructions: 3 occurrences, or 0.00%
  ±4.5 instructions: `write_crate_metadata()`

Largest 4 variances:
  ±0 instructions: 1903452 occurrences, or 99.98%
  ±0.5 instructions: 423 occurrences, or 0.02%
  ±1 instructions: 3 occurrences, or 0.00%
  ±4.5 instructions: `write_crate_metadata()`

(Should probably not show the "Largest N variances" entries that are also in the "Smallest N variances")

eddyb · 2020-11-23T00:46:40Z

This looks good to me. I think all of the feedback I could give is covered by the FIXMEs 🙂

Do you think any of them should block merging? I think this could be useful to anyone who wants to reproduce the rdpmc experiments we did.

wesleywiser

Do you think any of them should block merging? I think this could be useful to anyone who wants to reproduce the rdpmc experiments we did.

No, I don't think so. Sorry for the delay on this, I wanted to review it again since it's been a while.

wesleywiser · 2020-11-26T01:23:50Z

summarize/src/main.rs

+        .map(|file| ProfilingData::new(&file))
+        .collect::<Result<Vec<_>, _>>()?;
+
+    // FIXME(eddyb) return some kind of serializable data structure from `aggregate_profiles`.


This would be nice to have since it would probably be part of any plans to implement the aggregation feature for perf.rlo.

summarize/src/aggregate.rs

Co-authored-by: Wesley Wiser <[email protected]>

eddyb force-pushed the aggregate branch from bb1d2ea to 3536240 Compare August 23, 2020 17:43

wesleywiser reviewed Aug 26, 2020

View reviewed changes

wesleywiser self-assigned this Aug 26, 2020

eddyb force-pushed the aggregate branch 5 times, most recently from e79bac9 to ad97fb3 Compare September 8, 2020 20:52

eddyb force-pushed the aggregate branch 2 times, most recently from ef25949 to ccbbe4c Compare October 25, 2020 09:29

wesleywiser reviewed Oct 26, 2020

View reviewed changes

measureme/src/counters.rs Outdated Show resolved Hide resolved

measureme/src/counters.rs Outdated Show resolved Hide resolved

measureme/src/counters.rs Outdated Show resolved Hide resolved

measureme/src/counters.rs Outdated Show resolved Hide resolved

summarize: add "aggregate" sub-command for analyzing sets of profiles.

15a4158

eddyb force-pushed the aggregate branch from ccbbe4c to 15a4158 Compare November 3, 2020 10:23

eddyb marked this pull request as ready for review November 23, 2020 00:44

eddyb changed the title ~~[WIP] summarize: add "aggregate" sub-command for analyzing sets of profiles.~~ summarize: add "aggregate" sub-command for analyzing sets of profiles. Nov 23, 2020

wesleywiser approved these changes Nov 26, 2020

View reviewed changes

Fix nit in summarize/src/aggregate.rs

66f6def

Co-authored-by: Wesley Wiser <[email protected]>

wesleywiser merged commit 61e3581 into rust-lang:master Feb 19, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

summarize: add "aggregate" sub-command for analyzing sets of profiles. #129

summarize: add "aggregate" sub-command for analyzing sets of profiles. #129

eddyb commented Aug 23, 2020 •

edited

Loading

wesleywiser left a comment

eddyb commented Oct 26, 2020

eddyb commented Nov 3, 2020

eddyb commented Nov 23, 2020

wesleywiser left a comment

wesleywiser Nov 26, 2020

summarize: add "aggregate" sub-command for analyzing sets of profiles. #129

summarize: add "aggregate" sub-command for analyzing sets of profiles. #129

Conversation

eddyb commented Aug 23, 2020 • edited Loading

wesleywiser left a comment

Choose a reason for hiding this comment

eddyb commented Oct 26, 2020

eddyb commented Nov 3, 2020

eddyb commented Nov 23, 2020

wesleywiser left a comment

Choose a reason for hiding this comment

wesleywiser Nov 26, 2020

Choose a reason for hiding this comment

eddyb commented Aug 23, 2020 •

edited

Loading