-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add note about custom metadata and spam prevention #85
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1146,6 +1146,48 @@ <h3 id="gc">Garbage Collection</h3> | |
removed from the <a>report buffer</a> of any <a>reporting observer</a>. | ||
</section> | ||
|
||
<section> | ||
<h2 id="deployment">Deployment Considerations</h2> | ||
|
||
<h3 id="custom-metadata">Custom Metadata</h3> | ||
|
||
A server might want to include additional metadata in reports that are | ||
generated for their origin. This can be accomplished by encoding the extra | ||
metadata in the <a for="ReportTo">`url`</a> of any <a | ||
for="ReportTo">`endpoints`</a> in the <a>`Report-To`</a> response headers | ||
for the origin — for example, in the URL path or query parameters. | ||
|
||
<pre> | ||
<a>Report-To</a>: { "<a for="ReportTo">group</a>": "csp", | ||
"<a for="ReportTo">max-age</a>": 10886400, | ||
"<a>endpoints</a>": [ | ||
{ "<a for="ReportTo">url</a>": "https://example.com/reports?nonce=e897932f" } | ||
] } | ||
</pre> | ||
|
||
Since the instructions in a <a>`Report-To`</a> header will be used for future | ||
requests to the same origin, the server SHOULD NOT use this mechanism to | ||
encode metadata that is only valid for the current request. The metadata MUST | ||
be valid for all requests to the same origin from the same user. | ||
|
||
<h3 id="spam-mitigation">Spam Mitigation</h3> | ||
|
||
One potential use of [[#custom-metadata]] is to help prevent spam — report | ||
uploads that don't correspond to a real request made by a real user. For | ||
instance, when constructing the <a>`Report-To`</a> for a response, the server | ||
could create a nonce whose value depends on the origin of the request, and the | ||
public IP address of the client. The server would then embed this nonce into | ||
the <a for="ReportTo">`url`</a> values of the header. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What happens when I migrate to a different network and a report is triggered? E.g. I switch from coffee shop wifi to my phone, or move from work to home wifi? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Chrome will throw those reports away, on the assumption that the user might have made requests on one network that they wouldn't have made on another, and doesn't want that information to leak across the network boundary. If it didn't do that, though, the server would throw away the reports that were uploaded from a different IP than was used to make the original request. You can leave that out of the nonce calculation, but then a spammer could upload false reports from a huge number of clients using a single upload URL. Can add some text about there being a trade-off — if you include more fields in the nonce, you're better protected against false uploads, but you might also miss out on some legitimate ones; and vice versa. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, this has interesting implications.. a) Presumably the network switch logic wouldn't apply to all report types? For example, it's not clear to me why we'd clear deprecation reports on a network switch. b) Drop on network switch has interesting implications for NEL.. This means that we're blind to flaky networks or networks that block user traffic for some reason. That said, this is a discussion for NEL so we can defer that here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That would add to the boilerplate when defining a new event type. (I'm not against it, though — just walking through the ramifications!) Right now you'd have:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Do you mean specification boilerplate or user-visible boilerplate? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Spec There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK. In that case, IMO that's fine |
||
|
||
When the collector receives a report, it will have access to the nonce (since | ||
that will be part of the URL in the `POST` request to the collector). It can | ||
construct a nonce for each report in the upload, using the origin of the | ||
report's [=report/url=] and the IP address of the uploading client. If any of | ||
the per-report nonces don't match the nonce in the upload URL, the | ||
corresponding reports can be considered fraudulent, and dropped. | ||
|
||
</section> | ||
|
||
<section> | ||
<h2 id="sample-reports">Sample Reports</h2> | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't enforce the
MUST
here, we can only provide warnings and recommendations. On that note, it may be worth considering moving this entire section into a non-normative appendix?Same user may or may not be relevant here.. For example, if you want to communicate deployed version, then that has nothing to do with the user. Also, it's possible that the reported metadata will be out of date, since the report could be triggered due to a failed navigation without seeing the most up to date report-to information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's what I was getting at with the user part — if the metadata is valid for all requests made by that user, then even if they use an old header to report on a failed request, the metadata is still valid.
I like your out-of-date wording, better, though — describe the facts, let the reader / server owner decide how that affects their requirements.
If you want to encode something in the upload URL, though, you have to make sure that you deploy the same version to the same user for all
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, as a thought experiment.. Instead of talking about "same version", perhaps focus on "last seen reporting endpoint wins" and you should take that into account — e.g. this is not the right mechanism to execute a/b tests, or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fact that out-of-date URLs can be used to report network failures (as the more recent
Report-To
never made it to the browser) means that it'd be very hard to avoid replay attacks when aggregating reports on the server.If we are limiting reports to the original IP on which the
Report-To
response was received, it raises the bar here (e.g. attacker would need to be able to reliably spoof that IP), but only if the server e.g. deletes reports which responses got RST.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it some more, combining per-IP nonces with per-IP rate-limiting (so that a certain IP can't report more than what a reasonable user would) can probably overcome replays, or at least make them less impactful.