Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(source-maps): workaround CORS for fetching maps #9459

Merged
merged 23 commits into from
Mar 17, 2020
Merged

Conversation

connorjclark
Copy link
Collaborator

@connorjclark connorjclark commented Jul 25, 2019

This patch adds an interface over the Fetch domain. See comments for hows and whys.

Deferring unit tests until after feedback. I added a simple smoke test to confirm that CORS was vanquished. Not certain on the cookie stuff, definitely give that special attention.

cont. #9101
related: #9097

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

really clever workaround I like it! still a bit unsure about needing to start with cookies at all since I think it potentially opens the door to having Lighthouse as a auto-CSRF tool 😆but OK

does devtools fetch source maps with cookies?


<body>
<script>
console.log('a');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the second one need to have content?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neither do - I didn't meant to commit this.

}

const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this just based on if chrome thinks the resource is text-based or not?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, IDK how the agent decides but there is no way to configure it from the protocol.


const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {
resolve(Buffer.from(responseBody.body, 'base64').toString());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels like a tricky way to handle buffers, could we return a complex type {buffer: Buffer}|{text: string} or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return both cases as buffer?

instead of the string case: Buffer.from(bufStr, 'utf8');

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we currently fetch the robots.txt, sourcemaps, the start_url (HTML)

so we dont need non-text resources at all right now. fail any base64Encoded and we'll handle it later if we need to actually get these payloads

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm


// The first requestPaused event is for the request stage. Continue it.
if (!responseStatusCode) {
// Remove same-site cookies so we aren't buying stuff on Amazon.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we're not concerned with the sort of manifest/user profile state being stored in a samesite cookie that made us want to deal with cookies at all in the first place?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idk how cookies work tbh. so you're saying this bit of code totally invalidates any source maps behind cookie-authentication?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any cookie auth that they've marked as SameSite, yeah

marking auth cookies as samesite is a decent way to prevent the annoying site-style logout/language attack

lighthouse-core/gather/driver.js Outdated Show resolved Hide resolved
lighthouse-core/gather/driver.js Outdated Show resolved Hide resolved
Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should wait at least until there's an audit using source maps and we can see what the actual benefits would be. This isn't blocking anything in #9097 so it would be nice to have a much better picture of the tradeoffs involved before committing to this approach.

lighthouse-core/gather/driver.js Outdated Show resolved Hide resolved
if (!responseStatusCode) {
// Remove same-site cookies so we aren't buying stuff on Amazon.
const sameSiteCookies = await this.sendCommand('Network.getCookies', {urls: [url]});
const sameSiteCookiesKeyValueSet = new Set();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could split into its own fn for easier testing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

punting on testing b/c i don't really know how the cookie story plays out here, may be this is just removed and we totally delete all cookies?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to keep it simple and safe at first I would propose deleting all cookies, until we observe the need for them


const responseBody = await this.sendCommand('Fetch.getResponseBody', {requestId});
if (responseBody.base64Encoded) {
resolve(Buffer.from(responseBody.body, 'base64').toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we currently fetch the robots.txt, sourcemaps, the start_url (HTML)

so we dont need non-text resources at all right now. fail any base64Encoded and we'll handle it later if we need to actually get these payloads

/** @type {HTMLIFrameElement} */
const iframe = document.createElement('iframe');
// Try really hard not to affect the page.
iframe.style.display = 'none';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think you also want to set top.

 iframe.style.cssText = `
display: none;
visibility: hidden;
position: absolute;
left: -100px;
top: -100px;
width: 1px;
height: 1px;
`;

lighthouse-core/gather/driver.js Outdated Show resolved Hide resolved
resolve(responseBody.body);
}

// Fail the request (from the page's perspective) so that the iframe never loads.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you remind me (perhaps in a comment) why we do this?

Copy link
Collaborator Author

@connorjclark connorjclark Jul 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't have any more to add than what the comment says. I assume Blink will do more work if the iframe actually gets its contents loaded, even if all the styling on it makes it not visible. but i'm just guessing.

}

// Fail the request (from the page's perspective) so that the iframe never loads.
this.sendCommand('Fetch.failRequest', {requestId, errorReason: 'Aborted'});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aborted by Lighthouse?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is an enum, can't give custom messages.

* 3) if multiple commands to continue the same request are sent, protocol errors occur.
*
* So instead we have one global `Fetch.enable` / `Fetch.requestPaused` pair, and allow specific
* urls to be intercepted via `driver.setOnRequestPausedHandler`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* urls to be intercepted via `driver.setOnRequestPausedHandler`.
* urls to be intercepted via `fetcher.setOnRequestPausedHandler`.

* @param {string} url
* @param {(event: LH.Crdp.Fetch.RequestPausedEvent) => void} handler
*/
async setOnRequestPausedHandler(url, handler) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so...... private? :)

* @return {Promise<string>}
*/
async fetchResource(url, timeoutInMs = 500) {
if (!this.driver.isDomainEnabled('Fetch')) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also if we are going to be replacing this with a simple "Page.fetchTheThing" protocol method soon, then the whole enabling/disabling domain thing from outside is going to need to go away so I definitely vote to keep that internal to this class if possible :)

lighthouse-core/gather/fetcher.js Outdated Show resolved Hide resolved
.mockResponse('Debugger.disable', {});
.mockResponse('Debugger.disable', {})
.mockResponse('Fetch.enable', {})
.mockResponse('Fetch.disable', {});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplifies all this testing too :)

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

lighthouse-core/gather/gather-runner.js Show resolved Hide resolved
@brendankenny
Copy link
Member

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

@patrickhulce
Copy link
Collaborator

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

I don't think we were ever pausing, just writing in a way that keeps the transition smooth :)

@brendankenny
Copy link
Member

are we still pausing on this until we get a sense of the protocol method timeline/feature set?

I don't think we were ever pausing, just writing in a way that keeps the transition smooth :)

sorry, this was from a side conversation where (I believe) we agreed to evaluate the viability of the protocol method first

@connorjclark
Copy link
Collaborator Author

i kicked off an email thread about this. because it would be cool.
in the meantime though, lets not block this improvement. in the future i hope we can improve the Fetcher even more.
#9459 (comment)

then somewhere else I think @paulirish reached out to @sigurdschneider and learned things are still very WIP.

Looking forward to a protocol solution but don't wanna block this. It's unknown when protocol support will land.

@connorjclark connorjclark merged commit 72f0774 into master Mar 17, 2020
@connorjclark connorjclark deleted the fetch-no-cors branch March 17, 2020 23:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants