Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPIKE] Collecting performance information #3239

Conversation

romaricpascal
Copy link
Member

@romaricpascal romaricpascal commented Feb 2, 2023

This PR explores how we may collect stats about the size of our built files, as well as different ways we can access these stats. It looks at two sets of files:

  1. the minified builds in dist, that'll likely be loaded in a browser straight away
  2. the ES modules provided in package, likely to be consumed by a bundler to build the final JS app

Warning The code in this PR needs to be thrown away and re-built tidily, this is just a proof of concept!

Collecting the stats

fs.stat

Use Node to get the size of the file on disk using fs.stat.

This is the method used for assessing the files in dist as each Kb in these files is a Kb the browser will load.

Getting more information like the number of modules in the bundles didn't feel useful for the files in dist as they bundle the entirety of the library anyway.

rollup-plugin-visualizer

rollup-plugin-visualizer is a plugin that collects statistics during a Rollup build and outputs them, either as interactive HTML visualisations or raw data (YAML or JSON).

This method is used for scrutinising the ES Modules. It allows us to get a more information than the file on disk. By bundling a file that exports the module we want information on, it can let us know which other modules will get included, as well as how much each will contribute (accounting for tree-shaking). It also accounts for tree-shaking, when only part of a module would end up in the final bundle.

Unlike fs.stats which looks at the minified size on disk, this looks at the unminified code (even with @rollup/plugin-terser in the build).

Note The plugin could also be added to our own build process as we bundle the files in package/govuk. However this would have to wait until we're no longer stuck on an old version of Rollup.

Accessing the stats

The PR updates the test workflow to add a stat collection job upon PR. It also extracts the steps that were building dist and package to their own jobs as those are pre-requisite to collecting the stats.

Note The organisation of the jobs and steps is completely up for grabs, I just did what I could to make things work

On the PR as comment

To get quick access to the information, the workflow adds a comment on the PR with the information it collected. It'll reuse the same comment if new pushes are made on the PR (using specific content in the comment to identify it)

For the modules, we can extract quite a lot of information from the raw data exported by rollup-plugin-visualizer and format it however we want. We'll likely want to use a few <details> element to not have that comment be too overwhelming. We'll also have to keep in mind the 65536 characters limit for comments on Github.

Downloadable archive

Besides the raw data we can format on the PR comment, rollup-plugin-visualizer also creates HTML visualisations. Those are zipped and attached to the Github action's artifacts so they can be downloaded.

To speed up access, the PR also adds a link in the comment.

### Visualisation in the review app

The PR also adds access to the visualisations in the review app. It adds the stats collection to the Procfile so it runs before startup and serves them under /stats.

A link to the stats is added to the PR comment as well (and serve-index provides a cheap way to browse the files).

Warning I forgot to add the rebuild of dist and package, so that would need being added when we do it properly. The PR still shows how this may work in terms of UX, should we want it.

How it works

The work for collecting the ES module stats and commenting on Github is isolated in its own npm workspace, inside the stats folder.

The collection of file sizes for the dist folder happens in the script doing the comment itself, as it's only fs.stats calls. We could look to extract it in its own function come the time to properly implement things.

ES modules stats collection

You can run the stats collection with npm start --workspace stats from the root of the project.

The stats end up in stats/public, with naming conventions based on the file being looked at and what was asked to be exported.

rollup.config.js controls the Rollup build that'll output the statistics. It creates one entry to build per item in the stats.config.js file.

Those entries are purely virtual, not files on disk, (thanks to @rollup/plugin-virtual) and export either the whole module under scrutiny or specific exports from that module, so we can check tree-shaking.

Rollup outputs the built file to stdout. It's a bit verbose in the console, so we could either ignore it (> /dev/null) or could build into the tmp folder at the root of the project.

Stats serving in the review app

The review app adds a new middleware for /stats that serves the files themselves using express-static and uses serve-index to let us navigate between the different files exported.

CI workflow

The CI workflow, running after dist and package are built:

  • collects the stats
  • uploads the generated visualisation as an artifact of the action
  • comments on the PR

The commenting of the PR is split between:

  • comment-on-pr.js, the entry point exporting the function run by the GitHub Script action. It's mostly code for interacting with Github API, delegating the bulk of rendering the comment to generateStatsMessage()
  • generate-stats-message.js contains the code generating the comment with the stats.
  • lib/rollup-plugin-visualiser-stats.js contains the logic parsing the raw data from rollup-plugin-visualiser and aggregating it in a way that suits us

Thoughts and questions

  • The code in this PR needs to be thrown away and re-built tidily, this is just a proof of concept! Especially the workflow, which I wasn't quite sure how to organise so dist and package are re-built beforehand.
  • This PR shows what's feasible, but we don't necessarily need to do it all in one go. We could start by only looking at on-disk file size before diving deeper into analysing modules.
  • Looking at the ES module stats, we'll need to decide on some sorting and organisation for the content (bearing in mind we're limited to 65536 characters, we may want to put some info in the review app?)
  • The file sizes represent different things for dist and package. For the former, they're the actual size on disc. For the later, the size of the source. Not a bad thing, especially as they're relevant to what happens to each kind of files, but something to keep in mind.
  • rollup-plugin-visualizer will integrate with our build once we move to Rollup 3, so we could collect stats without making 1 build per file we want to look at. Making a new build is closer to how our files will be consumed though.
  • The file sizes returned by rollup-plugin-visualizer include the weight of the comments. In a way this makes sense as there's no guarantee that a bundler loading our file will have a minifier stripping comments, so these comments would contribute to the bundle size after all.
  • the raw data format exported from rollup-plugin-visualiser is not as well specified as that of Webpack. The project is also a much smaller scale. This may explain why many stats project rely on Webpack 🤔
  • The PR doesn't look into comparing the stats across branches, that's something we can look into separately.

Kind of tangential thought as well (and quite a heavy piece of work). We have 3 ways to build the project (build:compile, build:dist, build:package). This means the build runs 3 times on CI.
Could things be split up differently so we build only once, and then gather the results of the build in whichever way we want to ship them (ie. a "build" and then a "packaging" step?

The code is very dirty and just a buldozzed way to get through to what needed displaying to check feasability
Adds a new `/stats` route that'll point to the `public` folder of the `stats` project.
The route uses `express.static` to serve the files themselves, and `serve-index` to provide a folder navigation.
@govuk-design-system-ci govuk-design-system-ci temporarily deployed to govuk-frontend-pr-3239 February 2, 2023 15:37 Inactive
@github-actions
Copy link

github-actions bot commented Feb 2, 2023

Dist

File sizes represent minified code.

  • JavaScript: 44.94 KiB
  • CSS: 114.39 KiB
  • CSS (IE8): 81.91 KiB

Modules

File sizes represent non-minified code.

all.mjs: 136.94 KiB, 32 modules
Module Path Size Percentage
package/govuk-esm/common/index.mjs 5.57 KiB 4.07%
package/govuk-esm/common/normalise-dataset.mjs 1.29 KiB 0.94%
package/govuk-esm/common/closest-attribute-value.mjs 542 B 0.39%
package/govuk-esm/i18n.mjs 13.75 KiB 10.04%
package/govuk-esm/vendor/polyfills/Object/defineProperty.mjs 2.96 KiB 2.16%
package/govuk-esm/vendor/polyfills/Function/prototype/bind.mjs 8.06 KiB 5.89%
package/govuk-esm/vendor/polyfills/DOMTokenList.mjs 9.14 KiB 6.68%
package/govuk-esm/vendor/polyfills/Document.mjs 1.17 KiB 0.86%
package/govuk-esm/vendor/polyfills/Element.mjs 3.1 KiB 2.27%
package/govuk-esm/vendor/polyfills/Element/prototype/classList.mjs 3.48 KiB 2.54%
package/govuk-esm/vendor/polyfills/Element/prototype/dataset.mjs 2.53 KiB 1.85%
package/govuk-esm/vendor/polyfills/Element/prototype/matches.mjs 1.11 KiB 0.81%
package/govuk-esm/vendor/polyfills/Element/prototype/closest.mjs 952 B 0.68%
package/govuk-esm/vendor/polyfills/Element/prototype/nextElementSibling.mjs 817 B 0.58%
package/govuk-esm/vendor/polyfills/Element/prototype/previousElementSibling.mjs 841 B 0.60%
package/govuk-esm/vendor/polyfills/String/prototype/trim.mjs 712 B 0.51%
package/govuk-esm/vendor/polyfills/Window.mjs 744 B 0.53%
package/govuk-esm/vendor/polyfills/Event.mjs 6.71 KiB 4.90%
package/govuk-esm/vendor/polyfills/Date/now.mjs 566 B 0.40%
package/govuk-esm/components/accordion/accordion.mjs 16.48 KiB 12.04%
package/govuk-esm/components/button/button.mjs 2.37 KiB 1.73%
package/govuk-esm/components/details/details.mjs 4.18 KiB 3.05%
package/govuk-esm/components/character-count/character-count.mjs 15.87 KiB 11.59%
package/govuk-esm/components/checkboxes/checkboxes.mjs 5.6 KiB 4.09%
package/govuk-esm/components/error-summary/error-summary.mjs 5.91 KiB 4.32%
package/govuk-esm/components/notification-banner/notification-banner.mjs 2.01 KiB 1.47%
package/govuk-esm/components/header/header.mjs 3.01 KiB 2.20%
package/govuk-esm/components/radios/radios.mjs 4.09 KiB 2.99%
package/govuk-esm/components/skip-link/skip-link.mjs 2.48 KiB 1.81%
package/govuk-esm/components/tabs/tabs.mjs 7.71 KiB 5.63%
package/govuk-esm/all.mjs 3.29 KiB 2.40%
all.mjs (import {Accordion}): 67.59 KiB, 13 modules
Module Path Size Percentage
package/govuk-esm/common/index.mjs 4.92 KiB 7.28%
package/govuk-esm/common/normalise-dataset.mjs 1.29 KiB 1.91%
package/govuk-esm/i18n.mjs 13.75 KiB 20.35%
package/govuk-esm/vendor/polyfills/Object/defineProperty.mjs 2.96 KiB 4.38%
package/govuk-esm/vendor/polyfills/Function/prototype/bind.mjs 8.06 KiB 11.93%
package/govuk-esm/vendor/polyfills/DOMTokenList.mjs 9.14 KiB 13.52%
package/govuk-esm/vendor/polyfills/Document.mjs 1.17 KiB 1.74%
package/govuk-esm/vendor/polyfills/Element.mjs 3.1 KiB 4.59%
package/govuk-esm/vendor/polyfills/Element/prototype/classList.mjs 3.48 KiB 5.15%
package/govuk-esm/vendor/polyfills/Element/prototype/dataset.mjs 2.53 KiB 3.74%
package/govuk-esm/vendor/polyfills/String/prototype/trim.mjs 712 B 1.03%
package/govuk-esm/components/accordion/accordion.mjs 16.48 KiB 24.38%
components/accordion/accordion.mjs: 67.59 KiB, 13 modules
Module Path Size Percentage
package/govuk-esm/common/index.mjs 4.92 KiB 7.28%
package/govuk-esm/common/normalise-dataset.mjs 1.29 KiB 1.91%
package/govuk-esm/i18n.mjs 13.75 KiB 20.35%
package/govuk-esm/vendor/polyfills/Object/defineProperty.mjs 2.96 KiB 4.38%
package/govuk-esm/vendor/polyfills/Function/prototype/bind.mjs 8.06 KiB 11.93%
package/govuk-esm/vendor/polyfills/DOMTokenList.mjs 9.14 KiB 13.52%
package/govuk-esm/vendor/polyfills/Document.mjs 1.17 KiB 1.74%
package/govuk-esm/vendor/polyfills/Element.mjs 3.1 KiB 4.59%
package/govuk-esm/vendor/polyfills/Element/prototype/classList.mjs 3.48 KiB 5.15%
package/govuk-esm/vendor/polyfills/Element/prototype/dataset.mjs 2.53 KiB 3.74%
package/govuk-esm/vendor/polyfills/String/prototype/trim.mjs 712 B 1.03%
package/govuk-esm/components/accordion/accordion.mjs 16.48 KiB 24.38%

View stats on the review app

Interactive visualisation can be downloaded from the build's artifacts

@domoscargin
Copy link
Contributor

This looks great!

A few thoughts from me:

  • Since the file size work is pretty separate from the rollup stats, I agree it makes sense to just get that up and running quickly while we finesse the murkier module stuff
  • Part of that finessing would include relative tracking
    • Seeing the file sizes/module sizes/module numbers compared against the base branch (presumably just main).
    • Flagging any anomalies somehow (eg: highlighting a big difference in file size via red, bold text or something)
    • Potentially for the generated comment, we'd ONLY highlight modules with a significant change, so that would help with character limit.
    • Building up a collection of stats data to use for trend graphs and analysis
  • I think this works better as an informational step, but we might also want to consider creating a relative size increase limit that will fail the build, but could be manually overridden somehow (for when we're adding a big new component, or new behaviour, etc)
  • Very mild issue with the potential difference of using a newer Rollup vs the older one we use for actual builds, but I think it's not an issue since we're measuring relative trends here, not precise sizes.
  • Having stats in the review app is such a nice touch!

@romaricpascal romaricpascal marked this pull request as ready for review February 6, 2023 17:04
@romaricpascal romaricpascal requested a review from a team as a code owner February 6, 2023 17:04
@domoscargin
Copy link
Contributor

It'd be good to come up with next steps as an output of this spike. And especially good if we could split the work up so folks can work on different bits.

I imagine we can at least split the file size stuff from the rollup stuff. Maybe also split the comment generation into its own bit of work.

@36degrees
Copy link
Contributor

36degrees commented Feb 8, 2023

This is ace!

A few thoughts from me too, in no particular order:

  • If we can get enough 'headline stats' to give us confidence when making changes, I think we should consider pushing anything that gets easier post v5 (when we can upgrade rollup) back until later.
  • Is it worth including sizes after gzip / brotli compression for the dist files?
  • Any idea why 'build dist' and 'build package' take so much longer than 'dist'? (~1m 20s vs ~30s)
  • The 'comment on PR' logic is really neat! Wondering if we could make it reusable in other contexts if possible – maybe as a workflow that accepts a marker and the comment text?
  • It might make sense to use a comment (e.g. <!-- Performance metrics --> as the marker? (The name of the constant suggests this might have been the original intent?)
  • It might also be worth including the SHA of the head commit somewhere so that when e.g. pushing new commits you can tell if it's been updated?
  • I'm wondering if it's worth creating a custom template for the serve-index pages, as they're not great from an accessibility point of view – the colour contrast isn't great and there are some focusable elements without any focus indicators, for example. Reckon we could use the CSS from the review app and just change the markup to use our existing classes?

It'd be good to come up with next steps as an output of this spike. And especially good if we could split the work up so folks can work on different bits.

+1 – let's make sure we do this. @romaricpascal do you think you'd be able to draft some issues, or would it be useful to find some time to do them together?

@romaricpascal
Copy link
Member Author

@domoscargin @36degrees Cheers both for the comments 😊

It'd be good to come up with next steps as an output of this spike. And especially good if we could split the work up so folks can work on different bits.

+1 – let's make sure we do this. @romaricpascal do you think you'd be able to draft some issues, or would it be useful to find some time to do them together?

I was thinking of drafting a list of steps and posting them on Slack for discussion before creating actual issues.

@romaricpascal
Copy link
Member Author

Spike is complete and next steps are listed in #3279

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gather performance stats with rollup-plugin-visualiser
4 participants