Skip to content
This repository has been archived by the owner on Jul 21, 2019. It is now read-only.

Build performance analysis #643

Open
tgross opened this issue Aug 25, 2018 · 6 comments
Open

Build performance analysis #643

tgross opened this issue Aug 25, 2018 · 6 comments

Comments

@tgross
Copy link
Contributor

tgross commented Aug 25, 2018

Using the hugo --stepAnalysis option we'll find that, as we'd expect, the biggest amount of time is being spent in the "render and write pages" steps (~60 seconds overall compared to millseconds for the other steps).

We can use hugo --templateMetrics to break this down. Running this a couple times to make sure any disk cache is warmed up (although it didn't really make a difference), I get the following:

$ hugo --templateMetrics
Building sites …
Template Metrics:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  1m17.995022823s   20.364235ms   87.373313ms   3830  partials/footer.html
  1m17.27805573s   59.171558ms   172.15785ms   1306  talk/single.html
  1m4.596311572s   50.505325ms    124.3805ms   1279  speaker/single.html
  40.976022719s  602.588569ms  2.238049589s     68  program/single.html
  31.163888904s     32.8733ms   98.139365ms    948  event/single.html
  16.358431432s     4.27113ms   39.009423ms   3830  partials/head.html
  13.780003964s    3.597912ms   31.973227ms   3830  partials/head/seo.html
   7.063846619s    1.910696ms   19.516799ms   3697  partials/sponsors.html
   6.793879509s    1.773858ms   20.901307ms   3830  partials/head/seo/open_graph.html
   5.630256291s     1.47004ms    18.66299ms   3830  partials/head/seo/twitter_cards.html
   3.759348706s   55.284539ms  100.522296ms     68  speakers/single.html
   3.204683581s    33.38212ms    80.29072ms     96  welcome/single.html
   2.599846197s     678.811µs   11.047846ms   3830  partials/footer_scripts.html
   1.893856439s    1.814038ms   10.184713ms   1044  partials/events/event_metadata.html
   1.481349819s     386.775µs   14.390457ms   3830  partials/head_includes.html
   1.283811567s     338.647µs    14.35677ms   3791  partials/events/event_navbar.html
   1.204243451s     314.423µs   22.983662ms   3830  partials/head/seo/schema.html
   1.015476969s    42.31154ms   67.800444ms     24  blog/single.html
   959.264588ms   28.213664ms    59.11527ms     34  _default/single.html
   569.009769ms    3.979089ms   10.815179ms    143  partials/welcome.html
   517.457233ms   13.617295ms   32.827341ms     38  partials/future.html
   506.543895ms     132.256µs   13.611213ms   3830  partials/global_navbar.html
   361.180423ms      94.302µs     8.99108ms   3830  partials/google_analytics.html
   241.433581ms  241.433581ms  241.433581ms      1  events/single.html
   212.039645ms  212.039645ms  212.039645ms      1  partials/past.html
   167.433621ms    1.418928ms    4.282184ms    118  shortcodes/event_logo.html
   141.419853ms   70.709926ms  139.768148ms      2  _internal/_default/rss.xml
    98.010874ms       25.59µs     401.257µs   3830  partials/meta.html
    96.959016ms   96.959016ms   96.959016ms      1  index.html
     91.01287ms   30.337623ms   31.888345ms      3  section/blog.html
    69.631276ms   69.631276ms   69.631276ms      1  _internal/_default/sitemap.xml
    69.443485ms   69.443485ms   69.443485ms      1  speaking/single.html
    62.680491ms     435.281µs    3.336465ms    144  shortcodes/list_organizers.html
    59.129445ms        69.4µs     735.991µs    852  shortcodes/event_link.html
    56.235845ms     390.526µs    20.92267ms    144  shortcodes/list_core.html
    55.062137ms      152.95µs   13.932607ms    360  shortcodes/email_organizers.html
     20.76961ms     145.242µs    1.219671ms    143  partials/events/cta.html
    19.230417ms   19.230417ms   19.230417ms      1  404.html
    17.544956ms     130.932µs     587.166µs    134  shortcodes/cfp_dates.html
    15.964269ms   15.964269ms   15.964269ms      1  partials/speaking.html
    13.784635ms     119.866µs    1.187868ms    115  shortcodes/event_map.html
    12.359809ms      66.809µs     356.611µs    185  shortcodes/event_start.html
    11.896248ms   11.896248ms   11.896248ms      1  section/events.rss.xml
     8.102662ms    8.102662ms    8.102662ms      1  partials/map.html
     7.876521ms      52.162µs     316.568µs    151  shortcodes/event_location.html
     6.605939ms       40.28µs     249.934µs    164  shortcodes/event_end.html
     5.913856ms      41.355µs     258.694µs    143  shortcodes/event_twitter.html
     5.261609ms    5.261609ms    5.261609ms      1  section/speaking.rss.xml
     3.993664ms      50.552µs     258.007µs     79  shortcodes/email_proposals.html
     3.361318ms    3.361318ms    3.361318ms      1  section/blog.rss.xml
     1.360909ms      56.704µs     652.551µs     24  blog/summary.html
      796.904µs     265.634µs       724.5µs      3  shortcodes/google_form.html
      636.562µs      159.14µs     462.615µs      4  shortcodes/registration_start.html
      516.528µs     129.132µs     397.251µs      4  shortcodes/registration_end.html
      511.324µs     170.441µs     508.691µs      3  _default/list.html
      435.761µs      217.88µs     383.803µs      2  _internal/shortcodes/figure.html
      425.574µs     141.858µs     266.418µs      3  partials/blog_pagination.html
        420.5µs       420.5µs       420.5µs      1  shortcodes/privacy_policy.html
      328.436µs     328.436µs     328.436µs      1  _internal/shortcodes/youtube.html


                   |  EN
+------------------+------+
  Pages            | 3847
  Paginator pages  |    2
  Non-page files   |   67
  Static files     | 6266
  Processed images |    0
  Aliases          |  347
  Sitemaps         |    1
  Cleaned          |    0

Total in 57247 ms

The immediate low-hanging fruit here is the partials/footer.html which is taking up ~60s of CPU time all by itself. The footer has nothing specific to a page on it, so we can use the partialCached function on it. Because this is called in every page this ends up saving us almost half our rendering time right off the bat.

After caching the footer:

$ hugo --templateMetrics
Building sites …
Template Metrics:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  41.411188753s   31.708414ms  189.413536ms   1306  talk/single.html
  40.120164067s   31.368384ms   77.390703ms   1279  speaker/single.html
  34.756380122s  511.123237ms  1.586236629s     68  program/single.html
   13.11547194s    3.424405ms   28.981152ms   3830  partials/head.html
  11.028678939s     2.87955ms   23.307328ms   3830  partials/head/seo.html
    6.70176324s     7.06937ms   34.614808ms    948  event/single.html
    6.05451722s    1.637683ms   14.983533ms   3697  partials/sponsors.html
   5.387492342s    1.406655ms   20.829883ms   3830  partials/head/seo/open_graph.html
   4.629631978s    1.208781ms   21.779831ms   3830  partials/head/seo/twitter_cards.html
   2.158141125s   31.737369ms   56.709983ms     68  speakers/single.html
   1.779862497s     464.716µs    8.390858ms   3830  partials/footer_scripts.html
   1.266605005s    1.213223ms    7.715693ms   1044  partials/events/event_metadata.html
   1.216056198s     317.508µs    8.950679ms   3830  partials/head_includes.html
   1.094646464s     288.748µs     6.59618ms   3791  partials/events/event_navbar.html
   1.043537285s    10.87018ms   26.050307ms     96  welcome/single.html
...
   (many more)
...
25.201791ms   25.201791ms   25.201791ms      1  partials/footer.html

Total in 32700 ms

That saves us a lot!


Some other areas of improvement to be made:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  41.411188753s   31.708414ms  189.413536ms   1306  talk/single.html
  40.120164067s   31.368384ms   77.390703ms   1279  speaker/single.html

The talk and speaker pages are unique so we can't avoid rendering these a lot. But they are relatively expensive so if we can make some improvements on these it'll bring down overall time.

A look at the talk/single.html template shows that in order to find data about the speaker we're looping over all the speakers that have ever spoken! If instead we use the .GetPage (with proper nil-testing), we can avoid doing all that work.

After making that fix:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
   9.401234875s    7.198495ms   34.449072ms   1306  talk/single.html
...
Total in 29238 ms

We do the same thing for the speaker page, where we iterate over all the talks to find the just the ones they're giving at that event. After making that fix for speaker pages too:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
   7.668514759s    5.995711ms   32.592151ms   1279  speaker/single.html
...
Total in 16277 ms

I've opened a PR with these three big-ticket items.


A future area to examine is the program page:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  33.957641921s  499.377087ms  1.547919927s     68  program/single.html

The program page is very expensive on an individual basis. So optmizing that would have a big impact.


Also for future work, smaller but harder-fought wins could be found in some of the other partials:

     cumulative       average       maximum
       duration      duration      duration  count  template
     ----------      --------      --------  -----  --------
  13.076714781s    3.414285ms   36.539806ms   3830  partials/head.html
  11.006667232s    2.873803ms   35.556077ms   3830  partials/head/seo.html
   8.106079242s    6.206798ms    24.68941ms   1306  talk/single.html
   7.668514759s    5.995711ms   32.592151ms   1279  speaker/single.html
   6.906285184s     7.28511ms   33.678155ms    948  event/single.html
   6.228540789s    1.684755ms   15.531651ms   3697  partials/sponsors.html
   5.261766705s    1.373829ms   25.380798ms   3830  partials/head/seo/open_graph.html
   4.725021227s    1.233687ms   33.282696ms   3830  partials/head/seo/twitter_cards.html
   2.555655355s   37.583166ms   68.366637ms     68  speakers/single.html
   1.794808041s     468.618µs    12.21502ms   3830  partials/footer_scripts.html
   1.290601801s    1.236208ms    5.400415ms   1044  partials/events/event_metadata.html
   1.224713451s     319.768µs   11.675193ms   3830  partials/head_includes.html

Probably no way around rendering most of these for each page, but each renders very quickly. If we could extract cacheable partials from these that might reduce some time here but we'd want to test each change.

cc @bridgetkromhout @mattstratton

@bridgetkromhout
Copy link
Contributor

This is super exciting. Thanks for taking the time to test performance and give us such a detailed analysis, @tgross! This is going to help local development, but I also suspect it might help with our netlify build times too.

@mattstratton
Copy link
Member

I bet that I’m doing something similar with the program page of looping through all the talks, etc, so it probably can be made much better!

@bridgetkromhout
Copy link
Contributor

bridgetkromhout commented Sep 13, 2018

Uh-oh! @adrianmoisey noticed that unfortunately the change in #644 had the unintended side effect of making it so when one speaker has two talks at a given event, only one of them shows up. Example: https://www.devopsdays.org/events/2018-portland/speakers/heidi-waterhouse/ - she has an ignite as well, and it shows up if I revert the changes to the speaker page that were added in this PR:

screen shot 2018-09-13 at 12 46 19 pm

Checking in with @tgross to see if it would be trivial for him to adjust that, or if in the interests of "oops, this affects an ongoing event today" I should revert those specific changes and we can revisit?

Heidi is a developer advocate with LaunchDarkly. She delights in working at the intersection of usability, risk reduction, and cutting-edge technology. One of her favorite hobbies is talking to developers about things they already knew but had never thought of that way before. She sews all her conference dresses so that she’s sure there is a pocket for the mic.

@tgross
Copy link
Contributor Author

tgross commented Sep 13, 2018

I would revert.

The change I made relies on the URL construction to query for the correct page for the talk, but this turns out to be an incorrect assumption when there's multiple talks by the same speaker because (obviously) the two talks don't have the same URL. I'll need a bit of time to mess around with a more reliable query there, so I would recommend reverting in the meantime.

@bridgetkromhout
Copy link
Contributor

Cool, thanks. Indeed, URL will vary - it can and will differ (of course) but also it can be a workshop name or any arbitrary string not containing speaker name.

Should I revert the whole PR or just the speaker page bit, do you think?

@tgross
Copy link
Contributor Author

tgross commented Sep 13, 2018

It's easier for me to land the fix later if you only revert the speaker page bit but probably easier for you to revert the whole commit. So whichever you want 😀

tgross added a commit to tgross/devopsdays-theme that referenced this issue Sep 13, 2018
The performance improvement in devopsdays#644 to the speaker page implicitly
(and incorrectly) assumes that a speaker will give a single talk
per event. Rather than using .GetPage which can return only a single
page, we can change the original call to .Site.Pages and get the
correct behavior with at least some of the performance improvement.

This could be further improved if the program YAML included the event
name, or if Hugo supported a "contains" or "startswith" operator in
the where function.

Fixes devopsdays#643 (comment)
Replaces devopsdays-web f944fea71d071b959f715e9cdcb0ebb00f52c978
tgross added a commit to tgross/devopsdays-theme that referenced this issue Sep 13, 2018
The performance improvement in devopsdays#644 to the speaker page implicitly
(and incorrectly) assumes that a speaker will give a single talk
per event. Rather than using .GetPage which can return only a single
page, we can change the original call to .Site.Pages and get the
correct behavior with at least some of the performance improvement.

This could be further improved if the program YAML included the event
name, or if Hugo supported a "contains" or "startswith" operator in
the where function.

Fixes devopsdays#643 (comment)
Replaces devopsdays-web f944fea71d071b959f715e9cdcb0ebb00f52c978
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants