Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(driver): security errors are no longer a fatal or pageload error #8865

Merged
merged 3 commits into from
Jun 3, 2019

Conversation

paulirish
Copy link
Member

@paulirish paulirish commented May 5, 2019

fix #6655, #7292
closes #7574

Following the investigation and proposal outlined here: #6655 (comment)

I've tested with every badssl page and we never hang indefinitely on any of them. I did see a few PROTOCOL_TIMEOUT on the offlinePass though.

Also every badssl case where a security interstitial is shown hits our FAILED _DOC_REQ case and the localizedFailDescription that we show in the warning captures the same error code we got via securityState.insecureDescriptions. \o/

see also #8340

patrickhulce
patrickhulce previously approved these changes May 6, 2019
Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's awesome news!

LGTM

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the change, but not sure about the very end result. Does it align with "we shouldn't throw if we can technically finish loading the page" since we can't technically load the page? It feels like there should be non-zero exit code for this one, otherwise it's going to be very easy to miss in any programmatic environment.

@@ -18,10 +18,10 @@ module.exports = [
},
},
{
errorCode: 'INSECURE_DOCUMENT_REQUEST',
errorCode: undefined,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should have other expectations then? And maybe a comment that this is ensuring there isn't an error (since it's in the error-expectations file :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fixed by followup to #8340, the comments I have there address the fact that none of these are currently fatal errors but IMO should be non-zero exit codes.

@patrickhulce
Copy link
Collaborator

What's our opinion on this one are we nuking it entirely like this PR does?

Or are we just going to move it to be non-fatal with a warning like followup to #8340 would?

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, this seems workable, but it has a few drawbacks currently:
Screen Shot 2019-05-28 at 18 10 07

  • this definitely yells that something went wrong, but it's just kind of impressively blown up instead of communicating well what happened. Consistent style for lh-warnings #7011 will help, but is it strange to be showing all this content when none of it is meaningful?
  • a whole bunch of logging from work that didn't need to be done (all the audits)
  • as @patrickhulce mentioned, a little unfortunate that most of the error messages are referencing the gatherer (Required TagsBlockingFirstPaint gatherer encountered an error: FAILED_DOCUMENT_REQUEST), but at least they get a good error code in there
  • also weird that we still take a trace and a devtools log for a page that isn't the page being targeted (this is why all the metrics fail with Something went wrong with recording the trace over your page load. Please run Lighthouse again. (NO_NAVSTART), since their artifacts (traces and devtoolsLogs) do exist)
  • as a result of how we currently put together runtimeError from the artifacts, NO_NAVSTART doesn't bubble up as a runtime error (even though it's marked as one), so e.g. the current badssl error smoke test doesn't get a runtimeError or an error exit code. That definitely shouldn't depend on what audits are being run.

Maybe this would work better if we added something more specific than the default "no artifact" failure mode for audits in this case

@brendankenny
Copy link
Member

(I can help come up with actual solutions--not just a list of problems--tomorrow, I just had to leave for the day :)

@patrickhulce
Copy link
Collaborator

patrickhulce commented May 30, 2019

My proposal for path forward:

  1. Land this as-is.
  2. Follow-up PR to make all runtimeErrors exit with a non-zero exit code as discussed here and in core(gather-runner): treat NO_FCP as a pageLoadError #8340. This should not impact any environment other than CLI. Other consumers should already be deciding whether to consume the LHR or not based on the existence of runtimeError.
  3. Land a modified version of core(navigation): Add an option to ignore https errors during navigation #7574 to actually make HTTPS pages with bad certs loadable. This PR as-is doesn't fix the core(navigation): Add an option to ignore https errors during navigation #7574 concern because the pages still fail to load.
  4. Land a rebased version of core: bail on gathering if we have a failure in the first pass #8866.
  5. Land a version of core: bail on gathering if we have a failure in the first pass #8866 but for audits that fails them out with a specifc runtimeError error and skips the unnecessary logging.

Specific responses and rationale to @brendankenny's great points below:

but is it strange to be showing all this content when none of it is meaningful?

IMO, yes it's super strange. But that's the paradigm we've settled on elsewhere and I can understand the rationale for certain types of integrations that want some of the information we collect.

a whole bunch of logging from work that didn't need to be done

Fixed by step 5.

also weird that we still take a trace and a devtools log for a page that isn't the page being targeted

As I understand it this is actually the primary desire and reason for providing this mostly useless LHR and returning the runtimeError instead of throwing. So, feature not bug ;)

as a result of how we currently put together runtimeError from the artifacts, NO_NAVSTART doesn't bubble up as a runtime error (even though it's marked as one), so e.g. the current badssl error smoke test doesn't get a runtimeError or an error exit code. That definitely shouldn't depend on what audits are being run.

I think this is fixed by the combination of steps 2, 5, and 6.

@brendankenny
Copy link
Member

Follow-up PR to make all runtimeErrors exit with a non-zero exit code as discussed here and in #8340. This should not impact any environment other than CLI. Other consumers should already be deciding whether to consume the LHR or not based on the existence of runtimeError.

I think with #8340 this is already done? (for all instances of runtimeError that actually become an lhr.runtimeError)

Land a modified version of #7574 to actually make HTTPS pages with bad certs loadable. This PR as-is doesn't fix the #7574 concern because the pages still fail to load.

with this PR landed, isn't --chrome-flags="--ignore-certificate-errors" sufficient? (see discussion from #6655 (comment) onwards)

steps 4 and 5 sound good

but is it strange to be showing all this content when none of it is meaningful?

IMO, yes it's super strange. But that's the paradigm we've settled on elsewhere and I can understand the rationale for certain types of integrations that want some of the information we collect.

well the strangeness feels bad :) https://googlechrome.github.io/lighthouse/viewer/?gist=b8c85fa12128c68f017e5074c4e50e2d doesn't seem like something we should be returning. lhr.runtimeError and the exit code are good to go in that case, but everything else is at best useless, at worst confusing.

Maybe we need the Lighthouse version of a server error page, a design reminiscent of the main report but not trying to render a bunch of stuff that makes no sense.

also weird that we still take a trace and a devtools log for a page that isn't the page being targeted

As I understand it this is actually the primary desire and reason for providing this mostly useless LHR and returning the runtimeError instead of throwing. So, feature not bug ;)

I agree in the case that --chrome-flags="--ignore-certificate-errors" is passed in, they get a net::ERR_CERT_DATE_INVALID error but since we can still load the page and test it we should. But without the flag, a trace and devtoolsLog for a page (the security interstitial) that's different than the page they were trying to test seems like a bug not a feature :)

It seems like @paulirish's "we shouldn't throw if we can technically finish loading the page" shouldn't apply if we're loading something but it isn't "the page".

So, suggested addenda to suggested plans:

  • step 5 += a Chrome interstitial detector that leads to a runtime error
  • removing some base artifacts in the face of an interstitial (e.g. trace and devtoolsLog of the interstitial)
  • a new error-state report. Maybe it keeps the ? ? ? ? ? header and Consistent style for lh-warnings #7011-ified warning but doesn't render categories or something

@patrickhulce
Copy link
Collaborator

patrickhulce commented May 30, 2019

I think with #8340 this is already done?

Almost but we're not quite there yet. I called it out in the PR description in #8340 but PAGE_HUNG and INSECURE_DOCUMENT_REQUEST both don't abide by this because of https://github.com/GoogleChrome/lighthouse/pull/8340/files#diff-42278a7ee772120215ea8cfe1b0cb1b1R74. INSECURE_DOCUMENT_REQUEST will take care of itself by landing this, so that just leaves PAGE_HUNG

with this PR landed, isn't --chrome-flags="--ignore-certificate-errors" sufficient?

Well depends on what you mean by sufficient, it'll still be subject to all the other problems you've called out and seems like a LH flag to do both for you sounds like a good idea :)

everything else is at best useless

The point as I understand it is that an integration can still take a look at the artifacts we managed to collect to get screenshots/request specifics and display a better debug screen. Maybe we should be building such a screen based on artifacts too, is part of your argument here?

step 5 += a Chrome interstitial detector that leads to a runtime error

👍

removing some base artifacts in the face of an interstitial (e.g. trace and devtoolsLog of the interstitial)

As I understand it, this seems like it destroys the only value we have in not throwing. Why don't we want to make this information available for debugging?

a new error-state report. Maybe it keeps the ? ? ? ? ? header and #7011-ified warning but doesn't render categories or something

👍

@brendankenny
Copy link
Member

with this PR landed, isn't --chrome-flags="--ignore-certificate-errors" sufficient?

with this PR

lighthouse https://expired.badssl.com --view --chrome-flags="--ignore-certificate-errors"

really does work fine since Chrome can proceed to load the underlying page:

expired-badssl Lighthouse report

https://googlechrome.github.io/lighthouse/viewer/?gist=6797a0f98ab9c1a9f3f86abad2bea41a

I personally think we should have an audit that calls out the bad certificate even while everything else is green, but everyone else seems to think everything should be all green in this case since the user explicitly opted in to it, so the --ignore-certificate-errors case may be good to go after this lands.

@patrickhulce
Copy link
Collaborator

patrickhulce commented Jun 1, 2019

Wait I thought you were just saying it was the other screenshot and that was the main problem! Oh yes yes, still seems like a flag might be useful at some point in the future but not necessary anymore. 👍

@Janpot
Copy link
Contributor

Janpot commented Jun 1, 2019

I personally think we should have an audit that calls out the bad certificate even while everything else is green, but everyone else seems to think everything should be all green in this case since the user explicitly opted in to it

IMO It makes sense to make this opt in. And when I do opt in, I'd expect lighthouse to behave very similar when a website has no certificate vs. a bad certificate. I opt in, not because I don't care about security, but because a bad certificate shouldn't block me from auditing every other aspect of my website's performance. If there's an audit that calls me out for not having https, then it should also call me out for having broken https.

@patrickhulce
Copy link
Collaborator

So where do we stand with this then. @brendankenny do you have specific requests for this PR or are all of your concerns with points other than number 1 in the plan?

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should land this, we just have some rough edges to clean up after it lands

@cjolif
Copy link

cjolif commented Jul 24, 2019

any idea on when a release will that fix will be made?

@connorjclark
Copy link
Collaborator

#9442 release is imminent

@cjolif
Copy link

cjolif commented Jul 24, 2019

Thanks!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Don't fail localhost pages with INSECURE
7 participants