new_audit: add TTI Companion Metric to JSON #8975

deepanjanroy · 2019-05-16T21:09:04Z

PTAL - this implements the Sum of queueing time beyond 50ms TTI Companion Metric. A few questions / things I want to call out:

We don't have a good name for this metric right now. Currently we're calling it Cumulative Long Queuing Delay (Cumulative instead of total/sum-of because we use the word 'cumulative' in 'cumulative layout jank'.) Didn't want to use the word 'jank' either because of how overloaded it is. If you have better ideas for the name, please let me know, but we can probably name it better later if we decide to ship it.
This is currently showing up on the report page. How do I stop doing that?
I haven't fixed the proto-test yet. Wanted to get some more high level feedback before I spent time fixing that.
Currently measuring it between FCP and TTI.
- FCP as lower bound: Penalizing for long tasks before FCP felt counter-productive, because you should probably optimize for fast FCP and use up all the main thread you need to do that.
- TTI as upper bound: There was very little different between TTI and TTI + 10s, so using TTI is because the tests can run faster. I wish we could remove the dependency on TTI, but don't have a good idea on how to do that.

lighthouse-core/test/results/sample_v2.json

deepanjanroy · 2019-05-16T21:12:35Z

Also, looking at the pr title lint - what's a good title for this? report, metric, or misc?

patrickhulce · 2019-05-16T21:17:13Z

Awesome thanks so much for this @deepanjanroy!

If you have better ideas for the name, please let me know, but we can probably name it better later if we decide to ship it.

I believe @hoten come up with first ever pronounceable metric acronym if he'd like to share it :)

This is currently showing up on the report page. How do I stop doing that?

You should be able to just remove the group property in the config entry.

Penalizing for long tasks before FCP felt counter-productive

Big +1 👍

I wish we could remove the dependency on TTI, but don't have a good idea on how to do that.

IMO the only way we're going to be able to do this is something like the old fixed cost curve from eFID. That has the nice property that dragging out the trace doesn't end up affecting things much but tends to underpenalize bad experiences that happen very late, so I'm not sure how to resolve that issue really. On the one hand, pushing out the bad experience means people are indeed much less likely to experience it but if it can wait that long it probably should have just been broken up and run earlier in the first place soooooo... 🤷‍♂

pr title lint

I think we'll go with new_audit: add TTI companion to JSON or something similar that we did with Max potential FID at first? #5842

deepanjanroy · 2019-05-16T21:37:27Z

You should be able to just remove the group property in the config entry.
Done.

I wish we could remove the dependency on TTI, but don't have a good idea on how to do that.

IMO the only way we're going to be able to do this is something like the old fixed cost curve from eFID. That has the nice property that dragging out the trace doesn't end up affecting things much but tends to underpenalize bad experiences that happen very late, so I'm not sure how to resolve that issue really. On the one hand, pushing out the bad experience means people are indeed much less likely to experience it but if it can wait that long it probably should have just been broken up and run earlier in the first place soooooo...

Hm the downside of fixed cost curve is assuming the same distribution of all sites, and building a more sophisticated model (maybe taking into account other times when key big objects are first painted as opposed to just the first contentful paint - I think you suggested something like that) is not easy. For now I think taking it until TTI is probably the best tradeoff.

pr title lint

I think we'll go with new_audit: add TTI companion to JSON or something similar that we did with Max potential FID at first? #5842

Done - thanks!

patrickhulce

this looks great @deepanjanroy awesome job navigating the confusing world of simulated/observed LH metrics! 🎉 👏 👏

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

lighthouse-core/computed/metrics/cumulative-long-queuing-delay.js

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

patrickhulce · 2019-05-16T21:35:55Z

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

+    const settings = {throttlingMethod: 'provided'};
+    const context = {options, settings, computedCache: new Map()};
+
+    return Audit.audit(artifacts, context).then(output => {


nit: async/await and expect for new tests

though @brendankenny do you disagree with using expect for all new tests?

though @brendankenny do you disagree with using expect for all new tests?

no, it sounds good to me. I've (mostly) been matching the assert or expect used in existing test files just because I personally don't like when they're mixed, but totally 👍👍 for expect in new test files

Didn't realized I was mixing node assert and jest expects. Changed them all to jest expect since it has a nicer API.

oh wait sorry, still super nit but would prefer async/await` for new tests too :)

Whoops forgot to do async await last time. Done.

lighthouse-core/test/computed/metrics/cumulative-long-queuing-delay-test.js

patrickhulce · 2019-05-16T21:38:25Z

lighthouse-core/test/computed/metrics/cumulative-long-queuing-delay-test.js

+
+    it('only looks at tasks within FMP and TTI', () => {
+      const events = [
+        // TODO(deepanjanroy@): Is there an interval data structure in lighthouse?


good point, not really at the moment

Hm ok. I'm going to punt on introducing that for now.

connorjclark · 2019-05-16T21:49:02Z

I believe @hoten come up with first ever pronounceable metric acronym if he'd like to share it :)

:)

Time In Long Tasks (TILT). It's not comprehensive, but it gets the point across.

googlebot · 2019-05-21T19:02:24Z

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

ℹ️ Googlers: Go here for more info.

deepanjanroy · 2019-05-21T20:17:31Z

Time In Long Tasks (TILT). It's not comprehensive, but it gets the point across.

I love the acronym, but I worry it's too easy to interpret is as sum of long task durations, as opposed to sum of times beyond 50ms. I'm currently calling the "time beyond 50ms" as "long queuing delay region", and so the sum of all these regions are "cumulative long queuing delay". If we can find a better name for the "time beyond 50ms in tasks", a better name for the whole metric should follow from there.

deepanjanroy · 2019-05-21T20:20:11Z

Updated everything except the proto test (working on that) so please take another look.

As a tangent: I'm not sure if I've been spamming everyone here. I never used the new github review UI before, and I'm still figuring out how to batch comments properly; let me know if there are any conventions you follow.

connorjclark · 2019-05-21T21:29:13Z

I love the acronym, but I worry it's too easy to interpret is as sum of long task durations, as opposed to sum of times beyond 50ms. I'm currently calling the "time beyond 50ms" as "long queuing delay region", and so the sum of all these regions are "cumulative long queuing delay". If we can find a better name for the "time beyond 50ms in tasks", a better name for the whole metric should follow from there.

Thought about it a bit more... landed on an alternative to "long queuing delay region" - what about just "excessive"? Then we could do ETILT :) The excessive qualifies the time measurement, and suggests that it's not just a straight total.

As a tangent: I'm not sure if I've been spamming everyone here. I never used the new github review UI before, and I'm still figuring out how to batch comments properly; let me know if there are any conventions you follow.

Totally fine.

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

lighthouse-core/computed/metrics/cumulative-long-queuing-delay.js

exterkamp · 2019-05-24T00:05:37Z

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

+
+const UIStrings = {
+  title: 'Cumulative Long Queuing Delay',
+  description: '[Experimental metric]. Sum of Task Lengths beyond 50ms, between ' +


This description seems ambiguous. Is it the sum of tasks that are beyond 50ms in length? Or the sum of task lengths minus 50ms?

Ideas:
Sum of task lengths that occur between FCP and TTI that delay additional task queuing by more than 50ms.
Or make it less technical?
Sum of tasks that delay additional task queuing. This is an estimate of task delay based on mainthread busyness. <- don't know if thats really true 😆 just thinking out loud for ideas.

Hm you're right. Reworded to "[Experimental metric] Total time period between FCP and Time to Interactive during which queuing time for any input event would be higher than 50ms.' Is that clear enough?

patrickhulce

I think this LGTM from an HTTPArchive data gathering perspective! I'm super curious how the percentiles hold up there.

patrickhulce · 2019-05-24T03:48:42Z

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

+      // was 0ms. These numbers include 404 pages, so rounding up the scoreMedian to 300ms and
+      // picking 25ms as PODR. See curve at https://www.desmos.com/calculator/x3nzenjyln
+      scoreMedian: 300,
+      scorePODR: 25,


These are some seriously fast pages 😮

patrickhulce · 2019-05-24T03:49:59Z

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

+      // sites on mobile, 25-th percentile was 270ms, 10-th percentile was 22ms, and 5th percentile
+      // was 0ms. These numbers include 404 pages, so rounding up the scoreMedian to 300ms and
+      // picking 25ms as PODR. See curve at https://www.desmos.com/calculator/x3nzenjyln
+      scoreMedian: 300,


this does feel a bit weird that a single 350 ms task immediately after FCP and total mainthread silence after that already gives you a 50

if we were to start breaking our 75/95 rule, I think I'd want to start it here...

Changed to 600/200. We now have 600ms of jank = 0.5, and 100ms of jank = .999. 350 ms task will have 300ms of jank and get a .885.

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

lighthouse-core/test/audits/__snapshots__/metrics-test.js.snap

patrickhulce · 2019-05-24T03:57:22Z

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

+    const settings = {throttlingMethod: 'provided'};
+    const context = {options, settings, computedCache: new Map()};
+
+    return Audit.audit(artifacts, context).then(output => {


oh wait sorry, still super nit but would prefer async/await` for new tests too :)

patrickhulce · 2019-05-24T04:01:00Z

lighthouse-core/test/computed/metrics/cumulative-long-queuing-delay-test.js

+
+      const events = [
+        {start: 1000, end: 1110, duration: 110}, // Contributes 10ms.
+        {start: 2000, end: 2100, duration: 100}, // Contributes 50ms.


wouldn't this only contribute 50 anyway, sans clipping?

I guess the whole idea of TTI occurring in the middle of a task isn't really possible though anyhow :)

Ah yes it would. Changed a test case a little bit to test clipping better.

lighthouse-core/computed/metrics/lantern-cumulative-long-queuing-delay.js

Co-Authored-By: Patrick Hulce <[email protected]>

…ng-delay.js Co-Authored-By: Patrick Hulce <[email protected]>

…lay-test.js Co-Authored-By: Patrick Hulce <[email protected]>

…delay-test.js Co-Authored-By: Patrick Hulce <[email protected]>

deepanjanroy

Addressed comments and fixed tests.

I made a bikeshedding doc here for the name of the metric (internal because we haven't had much public discussions around this metric yet). Feel free to edit directly and discuss there, but I don't want to block landing the metric over naming it since it's all very experimental right now.

Also, how do I make the cla/google check happy?

Please take a look!

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

deepanjanroy · 2019-05-27T15:05:32Z

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

+
+const UIStrings = {
+  title: 'Cumulative Long Queuing Delay',
+  description: '[Experimental metric]. Sum of Task Lengths beyond 50ms, between ' +


Hm you're right. Reworded to "[Experimental metric] Total time period between FCP and Time to Interactive during which queuing time for any input event would be higher than 50ms.' Is that clear enough?

lighthouse-core/computed/metrics/cumulative-long-queuing-delay.js

deepanjanroy · 2019-05-28T13:29:25Z

lighthouse-core/audits/metrics/cumulative-long-queuing-delay.js

+      // sites on mobile, 25-th percentile was 270ms, 10-th percentile was 22ms, and 5th percentile
+      // was 0ms. These numbers include 404 pages, so rounding up the scoreMedian to 300ms and
+      // picking 25ms as PODR. See curve at https://www.desmos.com/calculator/x3nzenjyln
+      scoreMedian: 300,


Changed to 600/200. We now have 600ms of jank = 0.5, and 100ms of jank = .999. 350 ms task will have 300ms of jank and get a .885.

lighthouse-core/computed/metrics/lantern-cumulative-long-queuing-delay.js

lighthouse-core/test/audits/__snapshots__/metrics-test.js.snap

deepanjanroy · 2019-05-28T14:05:20Z

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

+    const settings = {throttlingMethod: 'provided'};
+    const context = {options, settings, computedCache: new Map()};
+
+    return Audit.audit(artifacts, context).then(output => {


Whoops forgot to do async await last time. Done.

deepanjanroy · 2019-05-28T14:07:09Z

lighthouse-core/test/computed/metrics/cumulative-long-queuing-delay-test.js

+
+      const events = [
+        {start: 1000, end: 1110, duration: 110}, // Contributes 10ms.
+        {start: 2000, end: 2100, duration: 100}, // Contributes 50ms.


Ah yes it would. Changed a test case a little bit to test clipping better.

deepanjanroy · 2019-05-29T16:43:10Z

Ping on this: Anything else I need to do for merging?

patrickhulce · 2019-05-29T16:51:35Z

Sorry @deepanjanroy for the silence! My approval still stands, ideally it gets a onceover from another "metricy" person on the team.

@brendankenny @paulirish any takers?

brendankenny

some ~~superficial~~ style/nits feedback, but looks great to me.

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

brendankenny · 2019-06-04T00:14:32Z

lighthouse-core/test/audits/metrics/cumulative-long-queuing-delay-test.js

+    const output = await Audit.audit(artifacts, context);
+
+    expect(output.numericValue).toBeCloseTo(48.3, 1);
+    expect(output.score).toBeCloseTo(1, 2);


maybe add a comment here that it's expected to be // very nearly 1 or something if it really isn't expected to reach 1

It actually does reach 1. Changed to .toBe(1).

lighthouse-core/computed/metrics/cumulative-long-queuing-delay.js

brendankenny · 2019-06-04T00:34:07Z

lighthouse-core/computed/metrics/lantern-cumulative-long-queuing-delay.js

+  static getEstimateFromSimulation(simulation, extras) {
+    // Intentionally use the opposite FCP estimate, a more pessimistic FCP means that more tasks are
+    // excluded from the CumulativeLongQueuingDelay computation, so a higher FCP means lower value
+    // for the same work.


this gets a bit confusing. What about something like

// Intentionally use the opposite FCP estimate. A pessimistic FCP is higher than an optimistic FCP, // which means more tasks are excluded from the CumulativeLongQueuingDelay computation. So a more // pessimistic FCP gives a more optimistic CumulativeLongQueuingDelay for the same work.

A pessimistic FCP is higher than an optimistic FCP

is that strictly true? I don't know :)

is that strictly true?

pessimistic FCP is strictly higher than or equal to optimistic FCP :)

It includes everything in optimistic plus the script initiated render-blocking priority requests

Reworded this comment (and the comment below about TTI.)

brendankenny · 2019-06-04T00:35:45Z

lighthouse-core/computed/metrics/lantern-cumulative-long-queuing-delay.js

+      : extras.interactiveResult.pessimisticEstimate.timeInMs;
+
+    // Require here to resolve circular dependency.
+    const CumulativeLongQueuingDelay = require('./cumulative-long-queuing-delay.js');


nit: move down to above the return statement`

I'm using the CumulativeLongQueuingDelay.LONG_QUEUING_DELAY_THRESHOLD constant unfortunately, so this is the furthest I can move it down.

brendankenny · 2019-06-04T00:38:34Z

lighthouse-core/computed/metrics/lantern-cumulative-long-queuing-delay.js

+      });
+    }
+
+    return events.sort((a, b) => a.start - b.start);


it makes some sense to always be sorted, but if it really is a perf concern, I don't think anything depends on it being that way?

Ah you're right - this sorting is not necessary. I remember I was doing the clipping in a different way initially which relied on sorting, but not anymore, and I forgot to change this. Removed the sorting.

deepanjanroy · 2019-06-04T22:10:27Z

Addressed comments. Do I need to do anything about the cla/google error?

brendankenny · 2019-06-05T00:07:52Z

Do I need to do anything about the cla/google error?

No, that's an annoying thing where it doesn't always recognize our changes through the github "suggestion" UI as coming from us, even though the author and committer are you and the Co-Authored-By is someone with a cla...

Still LGTM!

deepanjanroy requested review from brendankenny, patrickhulce and paulirish as code owners May 16, 2019 21:09

vercel bot deployed to staging May 16, 2019 21:09 View deployment

deepanjanroy commented May 16, 2019

View reviewed changes

lighthouse-core/test/results/sample_v2.json Show resolved Hide resolved

deepanjanroy changed the title ~~Implement TTI Companion Metric~~ new_audit: add TTI companion Metric to JSON May 16, 2019

deepanjanroy changed the title ~~new_audit: add TTI companion Metric to JSON~~ new_audit: add TTI Companion Metric to JSON May 16, 2019

vercel bot deployed to staging May 16, 2019 21:38 View deployment

patrickhulce reviewed May 16, 2019

View reviewed changes

vercel bot temporarily deployed to staging May 16, 2019 22:21 Inactive

vercel bot temporarily deployed to staging May 16, 2019 22:22 Inactive

vercel bot deployed to staging May 16, 2019 22:22 View deployment

GoogleChrome deleted a comment from googlebot May 20, 2019

vercel bot deployed to staging May 21, 2019 19:02 View deployment

vercel bot deployed to staging May 21, 2019 19:23 View deployment

vercel bot deployed to staging May 21, 2019 19:54 View deployment

vercel bot deployed to staging May 21, 2019 20:10 View deployment

exterkamp reviewed May 24, 2019

View reviewed changes

patrickhulce approved these changes May 24, 2019

View reviewed changes

vercel bot deployed to staging May 27, 2019 14:20 View deployment

vercel bot deployed to staging May 28, 2019 14:10 View deployment

Implement CLQD

a4f5187

deepanjanroy and others added 12 commits May 28, 2019 10:33

Remove metric group

6a0658e

Update lighthouse-core/computed/metrics/cumulative-long-queuing-delay.js

571e74d

Co-Authored-By: Patrick Hulce <[email protected]>

Update lighthouse-core/computed/metrics/lantern-cumulative-long-queui…

2f4aea0

…ng-delay.js Co-Authored-By: Patrick Hulce <[email protected]>

Update lighthouse-core/computed/metrics/lantern-cumulative-long-queui…

51e364f

…ng-delay.js Co-Authored-By: Patrick Hulce <[email protected]>

Update lighthouse-core/test/audits/metrics/cumulative-long-queuing-de…

15abc3c

…lay-test.js Co-Authored-By: Patrick Hulce <[email protected]>

Update lighthouse-core/test/computed/metrics/cumulative-long-queuing-…

b12f833

…delay-test.js Co-Authored-By: Patrick Hulce <[email protected]>

WIP

159b147

lint errors + update sample json

131617d

Update wording

4797fd9

Fix proto test

4af106b

Address all review comments

a122352

Rebase and add .js

516d06d

deepanjanroy force-pushed the comp-metric branch from 5aec9c7 to 516d06d Compare May 28, 2019 14:39

vercel bot deployed to staging May 28, 2019 14:39 View deployment

deepanjanroy commented May 28, 2019

View reviewed changes

googlebot added the cla: no label May 29, 2019

brendankenny approved these changes Jun 4, 2019

View reviewed changes

Address comments by brendankenny

ebbc706

vercel bot deployed to staging June 4, 2019 22:07 View deployment

brendankenny merged commit 952ae02 into GoogleChrome:master Jun 5, 2019

patrickhulce mentioned this pull request Jun 5, 2019

misc: update tracing-processor require #9123

Merged

brendankenny mentioned this pull request Oct 2, 2019

[meta] Lighthouse 6.0 Burndown #9774

Closed

54 tasks

snyk-bot mentioned this pull request Mar 21, 2020

[Snyk] Upgrade lighthouse from 5.1.0 to 5.6.0 godaddy/lighthouse4u#13

Merged

new_audit: add TTI Companion Metric to JSON #8975

new_audit: add TTI Companion Metric to JSON #8975

Conversation

deepanjanroy commented May 16, 2019

deepanjanroy commented May 16, 2019

patrickhulce commented May 16, 2019

deepanjanroy commented May 16, 2019

patrickhulce left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

connorjclark commented May 16, 2019

googlebot commented May 21, 2019

deepanjanroy commented May 21, 2019

deepanjanroy commented May 21, 2019

connorjclark commented May 21, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

patrickhulce left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deepanjanroy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deepanjanroy commented May 29, 2019

patrickhulce commented May 29, 2019

brendankenny left a comment • edited Loading

Choose a reason for hiding this comment

brendankenny Jun 4, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

deepanjanroy commented Jun 4, 2019

brendankenny commented Jun 5, 2019

brendankenny left a comment •

edited

Loading

brendankenny Jun 4, 2019 •

edited

Loading