Add note about custom metadata and spam prevention #85

Open: wants to merge 1 commit into base: main
42 changes: 42 additions & 0 deletions index.src.html
@@ -1146,6 +1146,48 @@ <h3 id="gc">Garbage Collection</h3>
removed from the <a>report buffer</a> of any <a>reporting observer</a>.
</section>

<section>
<h2 id="deployment">Deployment Considerations</h2>

<h3 id="custom-metadata">Custom Metadata</h3>

A server might want to include additional metadata in reports that are
generated for its origin. This can be accomplished by encoding the extra
metadata in the <a for="ReportTo">`url`</a> of any <a
for="ReportTo">`endpoints`</a> in the <a>`Report-To`</a> response headers
for the origin, for example in the URL path or query parameters.

<pre>
<a>Report-To</a>: { "<a for="ReportTo">group</a>": "csp",
             "<a for="ReportTo">max-age</a>": 10886400,
             "<a>endpoints</a>": [
               { "<a for="ReportTo">url</a>": "https://example.com/reports?nonce=e897932f" }
             ] }
</pre>
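
As a rough illustration of the paragraph above, a server could assemble such a header value by appending the metadata to the collector URL's query string. This is only a sketch: the helper name `report_to_header` and its parameters are hypothetical, and the values simply mirror the example header.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def report_to_header(group, max_age, base_url, metadata):
    """Return a Report-To header value whose endpoint URL carries `metadata`.

    Hypothetical helper, not part of any specification or library.
    """
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the collector URL.
    query = parse_qsl(parts.query, keep_blank_values=True) + sorted(metadata.items())
    url = urlunsplit(parts._replace(query=urlencode(query)))
    return json.dumps({"group": group, "max-age": max_age, "endpoints": [{"url": url}]})


header = report_to_header(
    "csp", 10886400, "https://example.com/reports", {"nonce": "e897932f"}
)
print("Report-To:", header)
```

Because the header applies to later requests to the same origin, anything placed in `metadata` here should remain meaningful beyond the current request.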

Since the instructions in a <a>`Report-To`</a> header will be used for future
requests to the same origin, the server SHOULD NOT use this mechanism to
encode metadata that is only valid for the current request. The metadata MUST
Member

We can't enforce the MUST here, we can only provide warnings and recommendations. On that note, it may be worth considering moving this entire section into a non-normative appendix?

valid for all requests to the same origin from the same user.

Same user may or may not be relevant here. For example, if you want to communicate the deployed version, that has nothing to do with the user. Also, it's possible that the reported metadata will be out of date, since the report could be triggered by a failed navigation, without the client having seen the most up-to-date Report-To information.

Member Author

Also, it's possible that the reported metadata will be out of date

Right, that's what I was getting at with the user part — if the metadata is valid for all requests made by that user, then even if they use an old header to report on a failed request, the metadata is still valid.

I like your out-of-date wording better, though: describe the facts, and let the reader / server owner decide how that affects their requirements.

If you want to encode something in the upload URL, though, you have to make sure that you deploy the same version to the same user for all

Member

Hmm, as a thought experiment.. Instead of talking about "same version", perhaps focus on "last seen reporting endpoint wins" and you should take that into account — e.g. this is not the right mechanism to execute a/b tests, or similar.

Contributor

The fact that out-of-date URLs can be used to report network failures (as the more recent Report-To never made it to the browser) means that it would be very hard to avoid replay attacks when aggregating reports on the server.
If we limit reports to the original IP on which the Report-To response was received, that raises the bar (e.g. an attacker would need to be able to reliably spoof that IP), but only if the server, for example, deletes reports whose responses got RST.

Contributor

Thinking about it some more, combining per-IP nonces with per-IP rate-limiting (so that a certain IP can't report more than what a reasonable user would) can probably overcome replays, or at least make them less impactful.

be valid for all requests to the same origin from the same user.

<h3 id="spam-mitigation">Spam Mitigation</h3>

One potential use of [[#custom-metadata]] is to help prevent spam: report
uploads that don't correspond to a real request made by a real user. For
instance, when constructing the <a>`Report-To`</a> header for a response, the
server could create a nonce whose value depends on the origin of the request
and the public IP address of the client. The server would then embed this
nonce into the <a for="ReportTo">`url`</a> values of the header.
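
The text does not prescribe how the nonce is computed. One possible sketch, assuming an HMAC-SHA256 keyed with a secret known only to the server and collector, is shown below; `SECRET_KEY`, `make_nonce`, and `endpoint_url` are hypothetical names, and the 16-character truncation is an arbitrary choice.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical shared secret; in practice this would come from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def make_nonce(origin, client_ip, key=SECRET_KEY):
    """Derive a nonce from the request origin and the client's public IP."""
    message = f"{origin}|{client_ip}".encode("utf-8")
    # Truncated to 16 hex characters to keep the URL short (an assumption).
    return hmac.new(key, message, hashlib.sha256).hexdigest()[:16]


def endpoint_url(collector, origin, client_ip):
    """Embed the nonce in the query string of the collector URL."""
    return f"{collector}?{urlencode({'nonce': make_nonce(origin, client_ip)})}"


print(endpoint_url("https://example.com/reports", "https://example.com", "203.0.113.7"))
```

Including the client IP in the nonce input means that reports uploaded after the client changes networks will not verify; leaving it out lets one upload URL be reused from many clients.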
Member

What happens when I migrate to a different network and a report is triggered? E.g. I switch from coffee shop wifi to my phone, or move from work to home wifi?

Member Author

@dcreager May 25, 2018

Chrome will throw those reports away, on the assumption that the user might have made requests on one network that they wouldn't have made on another, and doesn't want that information to leak across the network boundary.

If it didn't do that, though, the server would throw away the reports that were uploaded from a different IP than was used to make the original request. You can leave that out of the nonce calculation, but then a spammer could upload false reports from a huge number of clients using a single upload URL.

Can add some text about there being a trade-off — if you include more fields in the nonce, you're better protected against false uploads, but you might also miss out on some legitimate ones; and vice versa.

Member

Hmm, this has interesting implications..

a) Presumably the network switch logic wouldn't apply to all report types? For example, it's not clear to me why we'd clear deprecation reports on a network switch.

b) Drop on network switch has interesting implications for NEL.. This means that we're blind to flaky networks or networks that block user traffic for some reason. That said, this is a discussion for NEL so we can defer that here.

Member Author

a) Presumably the network switch logic wouldn't apply to all report types? For example, it's not clear to me why we'd clear deprecation reports on a network switch.

That would add to the boilerplate when defining a new event type. (I'm not against it, though — just walking through the ramifications!) Right now you'd have:

  • format of the report body
  • observable from JavaScript?
  • clear cache on network switch?

Contributor

That would add to the boilerplate when defining a new event type.

Do you mean specification boilerplate or user-visible boilerplate?

Member Author

Spec

Contributor

OK. In that case, IMO that's fine


When the collector receives a report upload, it will have access to the nonce
(since that will be part of the URL in the `POST` request to the collector).
The collector can then compute a nonce for each report in the upload, using
the origin of the report's [=report/url=] and the IP address of the uploading
client. Any report whose computed nonce doesn't match the nonce in the upload
URL can be considered fraudulent and dropped.
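
The collector-side check described above might look roughly like the following. This is a sketch under stated assumptions: the nonce is taken to be a truncated HMAC-SHA256 over the origin and client IP, the nonce is carried in a `nonce` query parameter, and each report is a dict with a `url` key. `SECRET_KEY`, `expected_nonce`, `origin_of`, and `filter_reports` are hypothetical names.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlsplit

# Hypothetical shared secret; must match whatever the issuing server used.
SECRET_KEY = b"replace-with-a-real-secret"


def expected_nonce(origin, client_ip, key=SECRET_KEY):
    """Recompute the nonce the server would have issued for this origin and IP."""
    message = f"{origin}|{client_ip}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()[:16]


def origin_of(url):
    """Simplified origin: scheme plus network location (no port normalization)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def filter_reports(upload_url, uploader_ip, reports):
    """Keep only reports whose recomputed nonce matches the one in the upload URL."""
    values = parse_qs(urlsplit(upload_url).query).get("nonce", [])
    if len(values) != 1:
        return []  # missing or ambiguous nonce: drop the entire upload
    url_nonce = values[0].encode("utf-8")
    return [
        report
        for report in reports
        # compare_digest avoids leaking match position through timing.
        if hmac.compare_digest(
            expected_nonce(origin_of(report["url"]), uploader_ip).encode("utf-8"),
            url_nonce,
        )
    ]
```

A report for a different origin, or an upload arriving from a different IP address than the one the nonce was issued for, fails the comparison and is filtered out.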

</section>

<section>
<h2 id="sample-reports">Sample Reports</h2>
