Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(gather-runner): don't save trace on pass with pageLoadError #9198

Merged
merged 4 commits into from
Jun 19, 2019

Conversation

brendankenny
Copy link
Member

@brendankenny brendankenny commented Jun 13, 2019

part of #8865 (comment)

#9176 turned a load that ends up on a Chrome interstitial (e.g. INSECURE_DOCUMENT_REQUEST) into a runtime error to let the user know that the page they were testing had a major problem and was never actually loaded for the test. Currently, however, all the artifacts in that errored pass are returned as errors except the devtoolsLog and trace, which end up in a kind of weird state where they're actually tracing the load of the interstitial, not any useful page.

If the load ends up just going to the interstitial, the metrics all end up as NO_NAVSTART errors, but if there was somehow a navigation (e.g. a URL redirects to a page with a bad cert), you'll end up with some nice metrics for the interstitial page.

Example report:

report with valid metrics for error interstitial

https://googlechrome.github.io/lighthouse/viewer/?gist=91d3c00c1186fdac9cc64f348dd14274

Regardless of seeing an audit error, in a pass with a pageLoadError, we shouldn't save the trace or devtoolsLog, just like we don't save the rest of the artifacts in that pass.

After this PR:

report with errored metrics for error interstitial

(also adds a test for the if-pageLoadError-stop-reloading behavior added in #8866/#9121)

@patrickhulce
Copy link
Collaborator

So I understand doing this from the perspective of the audit-visible artifacts and agree we prefer this report outcome, but from what I understand this eliminates the only current benefit of not having a fatal error, i.e. it's helpful right now when there's a runtimeError to be able to inspect the devtools log and trace for screenshots and network debugging data.

WDYT about trying to make the trace and devtools log still pass through to artifact saver/node result object?

@brendankenny
Copy link
Member Author

Another benefit of all these changes is if there's a fatal error in the second or third pass (like http redirects to a bad cert url), there are still results from whatever was gathered in the earlier passes.

WDYT about trying to make the trace and devtools log still pass through to artifact saver/node result object?

Do you mean like just for debugging? Or would there be other uses you foresee?

Maybe we could mark them somehow as "here this is, just know it's a trace of a bad state and you'll have to manually inspect it if you want something out of it"? But if you mean it should also be in the node lhr that's somewhat different...

@brendankenny
Copy link
Member Author

This also makes me wonder if pageloaderror always ending things is granular enough... An interstitial means the trace isn't even of your content so it's useless, but with PAGE_HUNG the trace is of your page, so some metrics might be extractable.

@patrickhulce
Copy link
Collaborator

Do you mean like just for debugging? Or would there be other uses you foresee?

Yeah, debugging. If we're eliminating literally every useful bit of information we managed to collect from the run in our return object, then I don't really see a reason to use runtimeError at all and we should just go back to fatally throwing again.

Maybe we could mark them somehow

That's what the existence of .runtimeError signals already, no?

But if you mean it should also be in the node lhr that's somewhat different...

I'm not sure what you mean here. We never return trace and devtools logs in the LHR and I'm certainly not saying we should start, but we need the artifacts in the node result object (of which the .lhr is a property) in order to write them to disk in CLI and let node consumers save anything for debugging.

I guess I don't understand why it's important to make sure they dont't have access to them. Given that most of these runtime errors are "hey there's something effed up with your page" I think there's value in giving the more sophisticated developer the data necessary to debug what's going wrong.

@brendankenny
Copy link
Member Author

Yeah, debugging. If we're eliminating literally every useful bit of information we managed to collect from the run in our return object, then I don't really see a reason to use runtimeError at all and we should just go back to fatally throwing again.

Well we do have (possible) earlier passes that were ok and can return artifacts. This might become more important if we start opening up gathering options to plugins, which could add more passes, etc.

Maybe we could mark them somehow

That's what the existence of .runtimeError signals already, no?

yes, but there could be a good trace and a bad trace from two different passes, for instance, and it wouldn't be clear which one was the bad one from just having a runtimeError.

I guess I don't understand why it's important to make sure they don't have access to them

this came out of trying to define exactly what GatherRunner should be returning, so this is a good conversation to have :)

My thinking was basically:

  • no pageLoadError -> here's all the artifacts I collected about the page
  • pageLoadError -> here's all the known good artifacts I was able to collect about the page

we already give up any hope of collecting artifacts from a bad pass (and more recently we give up any hope of collecting artifacts from any later passes) so it made sense to me to discard the trace/devtoolsLog from a known bad pass as well.

It sounds like (correct me if I'm wrong) you're coming at it from the other way...we have this data, why not make it available (in a non-default way beyond asking folks to set a breakpoint) since there may be something valuable in there. I can definitely see the usefulness of that as well.

@patrickhulce
Copy link
Collaborator

good trace and a bad trace from two different passes

image

😉 To seriously address your plugin point though, I can see the value there. Perhaps we need a property in the pageLoadError that describes which pass has failed? Better yet, remove the .online check and annotate passes for "expected" failures?

we already give up any hope of collecting artifacts from a bad pass (and more recently we give up any hope of collecting artifacts from any later passes)

The distinction to me is that pretty much all of our gatherers are trying to learn something about the page after it's been loaded while trace and devtools logs are observations of whatever happened while we were trying to load it. When the page fails to load, there's no sense in trying to learn about the fake/failed page but observations of whatever led to the failure are quite valuable.

@brendankenny
Copy link
Member Author

I think I agree with all of the above, and

Better yet, remove the .online check and annotate passes for "expected" failures?

is a really good idea.

What do we want Artifacts to look like, though? I still think we shouldn't be mixing known-good and known-bad traces in a single object (however theoretical this situation still is :).

If the only use case we have for keeping bad traces around is for debugging, we shouldn't make the Artifacts interface more complicated (e.g. checking a bad bit or an LHR field saying which pass(es) failed before accessing a trace) for the usual case to accommodate the (very) unusual case.

We could do something stupidly simple like renaming the trace/devtoolsLog property name from traces[passName] to something like traces[`bad-${passName}`]. Medium-awful solution :) but:

  • doesn't require any changes to existing runner/audits code but traces of errors won't be audited anymore
  • doesn't require a new top-level artifact
  • bad traces will still end up in the module/cli LHR and saved/loaded as normal with no changes to existing code

(or we could just drop them and require someone debugging to set a breakpoint but I could live with the above :)

@patrickhulce
Copy link
Collaborator

traces[`pageLoadError-${passName}`] actually sounds great to me, the CLI asset-saver logic/experience is identical and it clearly separates them from normal artifacts. Huzzah! :)

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

would prefer some of these changes, but I'll live either way :)

@@ -216,8 +216,7 @@ module.exports = [
},
audits: {
'http-status-code': {
score: 0,
displayValue: '403',
score: null,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this audit is useless at this point now, candidate for deletion?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this audit is useless at this point now, candidate for deletion?

I'd say yes (are there cases when non 200 status codes shouldn't be a runtimeError?)

* @param {LH.Gatherer.LoadData} loadData
* @param {string} passName
*/
static _saveLoadData(passContext, loadData, passName) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save makes me think of like a disk write for some reason, WDYT about

addLoadDataToBaseArtifacts
stashLoadDataOnBaseArtifacts
saveLoadDataToBaseArtifacts
augmentBaseArtifactsWithLoadData

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_addLoadDataToBaseArtifacts 👍

// Loads the page successfully in the first pass, fails with NO_FCP in the second.
async gotoURL(url) {
if (url.includes('blank')) return null;
if (firstLoad) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you feel about jest.fn()?

IMO, it would be a bit easier to read our mock setups if we constructed them consistently with the creators. Right now we kinda have to figure out each situation on a case-by-case basis

jest.fn()
  .mockResolvedValueOnce(requestedUrl)
  .mockRejectedValueOnce(new LHError(LHError.errors.NO_FCP));

could go whole hog if we wanted

const gotoBlankFn = jest.fn().mockResolvedValue(null);
const gotoPageURLFn = jest.fn()
  .mockResolvedValueOnce(requestedUrl)
  .mockRejectedValueOnce(new LHError(LHError.errors.NO_FCP));
const gotoURL = url => url.includes('blank') ? gotoBlankFn() : gotoPageUrlFn();
const driver = Object.assign({}, fakeDriver, {gotoURL});

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you feel about jest.fn()?

I like it sometimes and have no problem with it, but in this case no matter what I try the readability comes out slightly worse (the url.includes('blank') really gums up the works). I think it reads ok for this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants