core(byte-efficiency): use log-normal distribution scoring #14977
Conversation
Thanks for putting in a lot of work here @robatron! This looks good to me.
@brendankenny do you have any thoughts?
@robatron looks like you need to rebase and update a few unit tests for CI to pass
Nope! Seems reasonable and the resulting curve is a nice fit @robatron! I think the only thing is how it will affect scores in practice, which we should see when the tests are updated. The curve is well behaved and similar enough to the existing one that I can't imagine anything fatal will pop up. @robatron you'll need to run
// log-normal cumulative distribution function curve to the former method of linear interpolation
// scoring between the control points {average = 300 ms, poor = 750 ms, zero = 5000 ms} using the
// curve-fit tool at https://mycurvefit.com/ rounded to the nearest integer. See
// https://www.desmos.com/calculator/gcexiyesdi for an interactive visualization of the curve fit.
great demo!
@brendankenny, done! Tests are all passing for me locally now ✅
package.json
@@ -10,7 +10,7 @@
     "smokehouse": "./cli/test/smokehouse/frontends/smokehouse-bin.js"
   },
   "engines": {
-    "node": ">=16.16"
+    "node": ">=16.16 <19"
I noticed tests fail part way through with confusing engine errors on node v19, but work fine on v18 (the latest LTS). Thought it might be good to cap the node version compatibility so the failure is a friendlier error message and avoids confusion. What do you all think?
Before 👎
[I] ➜ node --version
v19.8.0
[I] ➜ yarn test
yarn run v1.22.18
< ... about 450 lines later ... >
$ bash flow-report/test/run-flow-report-tests.sh
+++ dirname flow-report/test/run-flow-report-tests.sh
++ cd flow-report/test
++ pwd
+ SCRIPT_DIR=/Users/robmc/code/lighthouse/flow-report/test
+ LH_ROOT=/Users/robmc/code/lighthouse/flow-report/test/../..
+ ARGS=(--testMatch='{flow-report/**/*-test.ts,flow-report/**/*-test.tsx}' --require="$LH_ROOT/flow-report/test/setup/env-setup.ts")
+ cd /Users/robmc/code/lighthouse/flow-report/test/../..
+ node --loader=@esbuild-kit/esm-loader core/test/scripts/run-mocha-tests.js '--testMatch={flow-report/**/*-test.ts,flow-report/**/*-test.tsx}' --require=/Users/robmc/code/lighthouse/flow-report/test/../../flow-report/test/setup/env-setup.ts
(node:66690) ExperimentalWarning: Custom ESM Loaders is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
running 12 test files
/usr/local/Cellar/node/19.8.0/bin/node[66690]: ../src/module_wrap.cc:599:MaybeLocal<v8::Promise> node::loader::ImportModuleDynamically(Local<v8::Context>, Local<v8::Data>, Local<v8::Value>, Local<v8::String>, Local<v8::FixedArray>): Assertion `(it) != (env->id_to_function_map.end())' failed.
1: 0x104bd587c node::Abort() [/usr/local/Cellar/node/19.8.0/bin/node]
2: 0x104bd5867 node::Abort() [/usr/local/Cellar/node/19.8.0/bin/node]
3: 0x104b97745 node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>) [/usr/local/Cellar/node/19.8.0/bin/node]
4: 0x104e6006f v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::MaybeHandle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) [/usr/local/Cellar/node/19.8.0/bin/node]
5: 0x1051d3ac3 v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) [/usr/local/Cellar/node/19.8.0/bin/node]
6: 0x104a162b4 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit [/usr/local/Cellar/node/19.8.0/bin/node]
flow-report/test/run-flow-report-tests.sh: line 20: 66690 Abort trap: 6 node --loader=@esbuild-kit/esm-loader core/test/scripts/run-mocha-tests.js ${ARGS[*]} "$@"
error Command failed with exit code 134.
After 👍
[I] ➜ node --version
v19.8.0
[I] ➜ yarn test
yarn run v1.22.18
error [email protected]: The engine "node" is incompatible with this module. Expected version ">=16.16 <19". Got "19.8.0"
error Commands cannot run with an incompatible environment.
This needs to be done in a breaking change. I'll add a note to address this in #14909, but we should remove it in this PR along with the .nvmrc.
I can repro in 19.8.0, but only in that version. @robatron looks like you were unlucky ending up on that version and not 19.8.1, which was released the next day with what looks like a fix: https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V19.md#2023-03-15-version-1981-current-targos
Lol, yep, > 19.8.0 seems to work fine. @adamraine, I reverted this and removed the .nvmrc.
Hi all!
This is my first PR to Lighthouse, and my first experience with log-normal distributions and curve-fitting, so please bear with me 😅
My intention is to address issue #11883. (Selected from the good first issue label.) Hopefully my approach isn't too far off, but in case it is, I hope this PR can still serve as a starting point for someone more familiar with these topics (or at least spark some discussion!)
Summary

This PR updates `ByteEfficiencyAudit`s to use standard log-normal distribution scoring, replacing a custom method based on a linear piecewise function. Parameters for the log-normal distribution scoring were selected by fitting its CDF curve to the original linear piecewise function using an online curve-fitting tool.

Details
The `ByteEfficiencyAudit.scoreForWastedMs()` function uses a non-standard scoring method defined by linear interpolation between the control points (0, 1), (300, 0.75), (750, 0.5), and (5000, 0), forming a linear piecewise function (desmos.com/dpuygs6bdh).
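For illustration, here is a minimal sketch of linear interpolation between those control points. It is only the shape of the old scoring curve, not the audit's actual implementation:

```js
// Hedged sketch of the old scoring curve: linear interpolation between the
// control points listed above (not a copy of ByteEfficiencyAudit's code).
const CONTROL_POINTS = [[0, 1], [300, 0.75], [750, 0.5], [5000, 0]];

function piecewiseScore(wastedMs) {
  if (wastedMs <= 0) return 1;
  for (let i = 1; i < CONTROL_POINTS.length; i++) {
    const [x0, y0] = CONTROL_POINTS[i - 1];
    const [x1, y1] = CONTROL_POINTS[i];
    if (wastedMs <= x1) {
      // Linear interpolation within the segment [x0, x1].
      return y0 + (y1 - y0) * (wastedMs - x0) / (x1 - x0);
    }
  }
  return 0; // Anything beyond 5000 ms scores 0.
}

console.log(piecewiseScore(300)); // 0.75
console.log(piecewiseScore(750)); // 0.5
```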
To use log-normal distribution scoring, we need to find values for the two required parameters, `p10` and `median`, such that the curve of the log-normal CDF fits this linear piecewise function. As suggested by @koraa in #11883, we can find these values with a function-fitting tool. In this case, I opted for an online tool, MyCurveFit, which accepts a list of data points and a "fit function" containing coefficients, then computes the coefficient values that best align the fit function with the given data points.
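Concretely, "fits" here can be read as a least-squares problem over sampled points of the piecewise function. This framing is my own; MyCurveFit doesn't document its objective:

$$
(p_{10}^{*},\ \mathrm{median}^{*}) = \underset{p_{10},\ \mathrm{median}}{\arg\min}\ \sum_i \bigl(\mathrm{piecewise}(x_i) - \mathrm{LogNormalScore}(x_i;\ p_{10}, \mathrm{median})\bigr)^2
$$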
Generating data points for the linear piecewise function
MyCurveFit requires a list of up to 500 data points to fit the curve of the fit function against. To approximate our linear piecewise function, we can use a spreadsheet to linearly interpolate between the control points, calculating the piecewise function's value every 10 ms between 0 and 5,000 ms (a sketch of this sampling follows the note below).
Note: I found that relying solely on the four control points as the target led to a suboptimal fit, because the curve-fitting process focused only on those points and ignored the behavior between them. Generating additional data points through linear interpolation as described above resulted in a much better fit. Based on this, I wonder whether 500 data points are enough for an accurate fit, or whether there are better curve-fitting tools that accept continuous functions as their target 🤔
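As a rough stand-in for the spreadsheet, a sketch of the sampling described above, reusing `piecewiseScore()` from the earlier sketch. Starting at 10 ms keeps the count at MyCurveFit's 500-point limit; how the original spreadsheet was laid out is an assumption:

```js
// Sample the piecewise function every 10 ms so the curve-fitting tool sees the
// whole shape, not just the four control points.
const rows = [];
for (let wastedMs = 10; wastedMs <= 5000; wastedMs += 10) {
  rows.push(`${wastedMs}\t${piecewiseScore(wastedMs).toFixed(4)}`);
}
// 500 tab-separated "x y" rows, ready to paste into MyCurveFit.
console.log(rows.join('\n'));
```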
Defining a fit function from the log-normal CDF
MyCurveFit now needs a "fit function" that will be fit to the data points of the linear piecewise function. We can derive this fit function from the log-normal CDF used in `getLogNormalScore()`; the derivation proceeds in the following steps (a hedged sketch of the resulting expressions follows the list):
- Start with the original log-normal CDF code from statistics.js.
- Substitute `standardizedX`.
- Substitute `xLogRatio`, `INVERSE_ERFC_ONE_FIFTH`, and `p10LogRatio`.
- Substitute `xRatio` and `p10Ratio`.
- Remove `Math.max(Number.MIN_VALUE, ...)`, since we assume `value`, `median`, and `p10` are all > 0.
- [Final step] Substitute JavaScript → MyCurveFit syntax: replace `Math.log()` → `ln()`, `-ln(A / B)` → `ln(B / A)`, `value` → `x` (the independent variable), `median` → `m`, and `p10` → `p` (the coefficients).
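Since the intermediate code blocks did not survive formatting here, the following is a hedged sketch of the derivation, reconstructed from the variable names above. The `erf` approximation, the exact shape of `getLogNormalScore()`, and the example parameter values are assumptions, not a copy of Lighthouse's statistics.js:

```js
// Abramowitz & Stegun formula 7.1.26 approximation of the error function.
function erf(x) {
  const sign = Math.sign(x);
  x = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * x);
  const y = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
      t * (-1.453152027 + t * 1.061405429))));
  return sign * (1 - y * Math.exp(-x * x));
}

const INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232; // erfc^-1(1/5)

// Original-style score (reconstruction): 1 - CDF of the log-normal distribution at `value`.
function getLogNormalScore({median, p10}, value) {
  const xRatio = Math.max(Number.MIN_VALUE, value / median);
  const xLogRatio = Math.log(xRatio);
  const p10Ratio = Math.max(Number.MIN_VALUE, p10 / median);
  const p10LogRatio = -Math.log(p10Ratio); // negated so the ratio stays positive
  const standardizedX = xLogRatio * INVERSE_ERFC_ONE_FIFTH / p10LogRatio;
  return (1 - erf(standardizedX)) / 2;
}

// After substituting everything and dropping the Math.max guards, the same score
// as a single expression; in MyCurveFit syntax this reads
//   y = (1 - erf(ln(x / m) * 0.9061938 / ln(m / p))) / 2
function fitFunction(x, m, p) {
  return (1 - erf(Math.log(x / m) * INVERSE_ERFC_ONE_FIFTH / Math.log(m / p))) / 2;
}

// Sanity check with arbitrary example parameters: both forms agree and hit the
// defining control points (0.5 at the median, 0.9 at p10).
const example = {median: 1000, p10: 200};
console.log(getLogNormalScore(example, 300), fitFunction(300, example.median, example.p10));
console.log(fitFunction(example.median, example.median, example.p10)); // 0.5
console.log(fitFunction(example.p10, example.median, example.p10));    // 0.9
```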
To validate the derived fit function, we can plot it alongside the original log-normal CDF from `statistics.getLogNormalScore()` (desmos.com/o98tbeyt1t). We would expect them to produce identical curves, which they do ✅

Original log-normal CDF (in blue) and the derived fit function (in red) producing identical curves (desmos.com/dpuygs6bdh)
Fitting the log-normal CDF to the linear piecewise function
Now that we've entered our data points and fit function into MyCurveFit, we can run the curve-fitting process. This provides the best-fitting values for our coefficients, `p10` and `median`, so that the fit function matches the data points as closely as possible.

Results:

- `p10` = 150.2663
- `median` = 934.7609

Results of fitting the log-normal CDF to the linear piecewise function (MyCurveFit)
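To get a feel for how the new curve shifts scores at the old control points (the question raised in review), here is a quick comparison reusing `piecewiseScore()` and `fitFunction()` from the sketches above with the fitted parameters; the printed values are whatever the approximation yields, not claims about the final audit scores:

```js
// Compare the old piecewise score with the fitted log-normal score.
const MEDIAN = 934.7609;
const P10 = 150.2663;
for (const wastedMs of [300, 750, 5000]) {
  const oldScore = piecewiseScore(wastedMs);
  const newScore = fitFunction(wastedMs, MEDIAN, P10);
  console.log(`${wastedMs} ms  old=${oldScore.toFixed(2)}  new=${newScore.toFixed(2)}`);
}
```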
Updating `ByteEfficiencyAudit` to use log-normal distribution scoring

With the values for `p10` and `median` obtained from the curve-fitting process described above, we can now update `ByteEfficiencyAudit.scoreForWastedMs()` to use the standard log-normal distribution scoring method:
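A sketch of what the updated method can look like. `Audit.computeLogNormalScore()` is the helper other Lighthouse audits already use for this; the constant names, the import path, and whether the fitted values end up rounded are assumptions here:

```js
import {Audit} from '../audit.js'; // illustrative path

// Fitted control points from the curve fit above.
const WASTED_MS_P10 = 150.2663;
const WASTED_MS_MEDIAN = 934.7609;

class ByteEfficiencyAudit extends Audit {
  /**
   * @param {number} wastedMs
   * @return {number} score between 0 and 1
   */
  static scoreForWastedMs(wastedMs) {
    return Audit.computeLogNormalScore(
      {p10: WASTED_MS_P10, median: WASTED_MS_MEDIAN},
      wastedMs
    );
  }
}
```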
Conclusion
In this PR, I aimed to address issue #11883 by replacing the custom linear piecewise function scoring method in `ByteEfficiencyAudit`s with standard log-normal distribution scoring. I obtained the required parameters (`p10` and `median`) for the log-normal distribution by fitting its CDF curve to the original linear piecewise function using an online curve-fitting tool, MyCurveFit.

(Again, if my approach is off track, I hope this PR can still serve as a starting point for someone more familiar with these topics, or at least spark some discussion ✨)
Thank you for your time reviewing this PR. I look forward to your feedback!