[Desktop] "Block all fingerprinting" is a misleading label #10691

Closed
l0k9j opened this issue Jul 10, 2020 · 8 comments

@l0k9j

l0k9j commented Jul 10, 2020

Description

The 'Block all fingerprinting' option in the Shields settings is very misleading, as it doesn't even block basic fingerprinting information such as the OS version in the User Agent, non-English content languages (or revealing combinations of languages), screen resolution, etc.

Steps to Reproduce

In Ubuntu, set 'Block all fingerprinting' in your settings and use a French or other non-English language dictionary (or a combination of multiple languages). Go to https://www.amiunique.org/fp and check your all-time similarity ratio.

Actual result:

Very low similarity ratios (< 1% and many < 0.05%) for things like User Agent, Fonts, Screen width, height, plugins, ...

Expected result:

If it claims to 'Block all fingerprinting', I'd expect the individual similarity ratios to be higher.

Reproduces how often:

Easily, if you have the right environment. I'm using Xfce with a customised taskbar, which affects the screen resolution and makes it very distinctive.

Brave version (brave://version info)

Version 1.8.95 Chromium: 81.0.4044.138

Miscellaneous Information:

I am aware that those individual items are not unique and the shortcomings of sites like amiunique (e.g. sample not necessarily representative) but some of those items are very easy to exploit (e.g. screen resolution) and in combination they can become pretty accurate, especially on low-traffic sites.

I'm also aware that some extensions do give advanced protections (e.g. user agent). But most of them target only one or a few strategies and installing extensions is always a leap of faith.

See related #10651

Suggestion: rename the label to something like 'block some fingerprinting'.
Consider having an additional option 'aggressive fingerprint prevention' (or even a dedicated section in the settings) which would be much more radical about the mitigation methods (i.e. align exposed values to closest most popular values even at the cost of some bearable usability degradation; e.g. align screen resolution to nearest most popular width, height; only expose primary language, with a default q).
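To make the "align to the nearest most popular value" idea concrete, here is a minimal sketch for the screen resolution case. The list of common resolutions is a made-up placeholder, and the function name is hypothetical; nothing here reflects how Brave actually works.

```python
# Hypothetical list of popular screen sizes. A real list would have to come
# from actual usage statistics; these values are placeholders.
COMMON_RESOLUTIONS = [(1366, 768), (1920, 1080), (1536, 864), (1440, 900), (1280, 720)]


def snap_resolution(width, height, candidates=COMMON_RESOLUTIONS):
    """Return the candidate closest to (width, height) by squared distance."""
    return min(
        candidates,
        key=lambda c: (c[0] - width) ** 2 + (c[1] - height) ** 2,
    )


# A 1920x1053 available area (full HD minus a custom taskbar) would be
# reported as plain 1920x1080.
reported = snap_resolution(1920, 1053)
```

The usability cost is that a page relying on the exact dimensions may lay itself out for a slightly different size than the real one.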

The privacy paradox here is that fingerprinting is made easier on configurations which have a better reputation for privacy (e.g. Ubuntu vs Windows/Mac). So the more users try to escape it, the more likely their browser is to expose those less mainstream choices.

Also keep in mind that not everyone lives in the US, speaks English or uses Mac or Windows. There's a significant long tail which is more vulnerable. It would be fantastic to see Brave showing the way as some of those items are easy to anonymise without much usability penalty for the user.

@rebron
Collaborator

rebron commented Jul 10, 2020

[Image: Screen Shot 2020-07-10 at 10 03 36 AM]
Closing. Labels have been updated in 1.11.x already which will be released next week. The text labels in Shields itself are more descriptive too.

@pes10k
Contributor

pes10k commented Jul 10, 2020

@l0k9j
@rebron commented above on the label change. You might also be interested in how Brave is tackling the fingerprinting vectors you mentioned, and specifically our solution to the problem of "combining multiple identifiers into a single unique one". The TL;DR answer is that for many of these things there's no solution that won't also break a bunch of things, so we instead try to make the combined fingerprint for the browser different for each website and each session, to prevent cross-session or cross-first-party tracking.

https://brave.com/whats-brave-done-for-my-privacy-lately-episode-4-fingerprinting-defenses-2-0

@l0k9j
Author

l0k9j commented Jul 10, 2020

Thanks @pes10k & @rebron for the clarifications. These are brilliant efforts and I look forward to seeing how the next version of Brave will perform. As explained above I'm especially interested to see how you'll handle 'basic' properties (like the OS in UA, accept-language header, or the screen resolution) which are effortless for sites to obtain but, on non-mainstream operating systems, can be quite discriminant. Properties which are present in the headers rather than actively extracted with JS calls/libraries are particularly sensitive as they are leaked more easily on web servers.

A good usability vs privacy trade-off will be difficult to reach as users have different tolerance thresholds.

@pes10k
Contributor

pes10k commented Jul 10, 2020

We don't have any plans to change the OS in the UA, the accept-language header, or the screen resolution (with the exception of the UA changes mentioned in the blog post I linked to).

Changing these will break too many websites, or be too jarring for our users. The good news is that these features alone are rarely, if ever, used by fingerprinting scripts (since there is rarely enough identifying information in those endpoints to uniquely identify someone). Our approach is to add noise to the other values (web audio, canvas, webgl, plugins, enumerated devices, etc.) that are combined with the three you mentioned. If a fingerprinter consumes even one randomized value, then the fingerprint will be unique and random (per page, per session) and provide the defenses you're looking for, without the web compat cost.

@l0k9j
Author

l0k9j commented Jul 10, 2020

@pes10k

I have "privacy.resistFingerprinting = false" in my Firefox instance, which returns a "Windows NT 10.0" OS in the User-Agent (instead of "X11; Linux x86_64") and I definitely don't see any breakage or anything jarring. Obviously some sites would suggest the wrong download file to me but that's even less disrupting than the "Strict, may break sites" category may suggest. However according to AmIUnique or panopticlick, it makes a significant difference in the similarity score.

Likewise, both fingerprinting test sites indicate that the accept-language header is one of the most discriminant fingerprinting properties (1 in 130730 according to panopticlick and <0.01% for AmIUnique). In my case the language string is two orders of magnitude more discriminant than the plugins or the fonts!
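As a rough illustration of what those ratios mean, a "1 in N" figure can be converted into bits of identifying information with a base-2 logarithm. This is a small sketch using only the numbers quoted above; the function name is my own.

```python
import math


def surprisal_bits(one_in_n):
    """Bits of identifying information carried by a value shared by 1 in N users."""
    return math.log2(one_in_n)


# The panopticlick figure quoted above: 1 in 130730 browsers share this header.
accept_language_bits = surprisal_bits(130730)

# "Two orders of magnitude more discriminant" corresponds to a factor of 100.
extra_bits = surprisal_bits(100)
```

If two properties were statistically independent, their bits would simply add up, which is why a few moderately rare values can identify a browser when combined. In practice properties are correlated, so the sum is an upper bound rather than an exact figure.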

By the way I don't see how I can control the language headers from the Brave settings. I'm guessing they came from the spelling dictionary I picked in the past and got stuck even after reverting those changes.

I'm wondering how you benchmark and test your anti-fingerprinting measures. Do you test them against samples or usage stats sent by the Brave clients? I'd be interested to see how representative the test set is with respect to the diversity of configurations and environments you might find across the globe.

@pes10k
Contributor

pes10k commented Jul 10, 2020

> I have "privacy.resistFingerprinting = false" in my Firefox instance, which returns a "Windows NT 10.0" OS in the User-Agent (instead of "X11; Linux x86_64") and I definitely don't see any breakage or anything jarring

Guessing you mean privacy.resistFingerprinting = true? :)

Unfortunately, every time we've changed the UA it's bitten us on long-tail sites. That's why, for example, we've rolled back our plans to put "Brave" in the UA. I expect this is why Firefox doesn't enable this by default. As described in that blog post, Brave by default will remove device information and reduce the granularity of minor version OS numbers in the UA, which will improve things / reduce identification somewhat. However, the "max fingerprinting" option will report a fixed UA for each system, which will do exactly what you're looking for :). Again, the blog post (and the issues linked from it) has more details if you'd like them. (UA specifically: #9190)

> accept-language

This one is a pickle, since changing this will break things in obvious ways if we're not extremely careful. If you don't mind, could I ask what your accept-language is currently?

But the difficulty of addressing some of these identifying parts of the browser is why Brave is pursuing our "fingerprinting protection through randomization" approach. For some users with uncommon configurations, or visiting relatively unpopular websites, it simply is not possible to beat fingerprinting by trying to make all users look identical; sometimes sites really do need to know your language and screen resolution to work correctly, etc.

Our approach is to sneak as many intentionally randomized values into the kinds of values that fingerprinters hash together when generating the unique-tracking value. Since no single value is unique, generally fingerprinters hash together a bunch of values to generate the unique value they track you with. Fingerprint2 is a perfect example of this. Getting any random value in that hash will result in a completely unique fingerprint (per page, per session), which makes you untrackable.

In other words: there is just too much in the platform that's pretty unique for most people, and trying to make these APIs look similar for most people will break a lot of things (again, why FF does not enable that option by default). We try to get as many "poison pills" into the platform as possible, so that if a fingerprinter uses even one of them in their fingerprint, you're protected, in a way that is much more protective than trying to force values into bins.
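A minimal sketch of the "poison pill" idea, under stated assumptions: the attribute names, the HMAC-based noise derivation, and the example domains are illustrative only, and this is not Brave's actual implementation. It shows that when one hashed input differs per site and per session, the whole fingerprint differs, even though the user agent, language, and screen values stay the same.

```python
import hashlib
import hmac
import secrets


def fingerprint(attributes):
    """Hash all attribute values together, as a fingerprinting script might."""
    material = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(material.encode()).hexdigest()


def noise_for(session_key, site):
    """Derive a value that is stable for one (session, site) pair only."""
    return hmac.new(session_key, site.encode(), hashlib.sha256).hexdigest()[:8]


# Values the browser leaves untouched (taken from this thread).
stable = {
    "user_agent_os": "X11; Linux x86_64",
    "accept_language": "en-GB,en-US;q=0.9,en;q=0.8",
}

session_key = secrets.token_bytes(32)  # regenerated for every browsing session

# The "canvas" entry stands in for any randomized value the script reads.
fp_site_a = fingerprint({**stable, "canvas": noise_for(session_key, "a.example")})
fp_site_b = fingerprint({**stable, "canvas": noise_for(session_key, "b.example")})
```

Within one session a given site sees a consistent fingerprint, so pages keep working, but two sites (or two sessions) cannot match their fingerprints against each other. The protection only applies if the script actually includes a randomized value in its hash.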

> By the way I don't see how I can control the language headers from the Brave settings. I'm guessing they came from the spelling dictionary I picked in the past and got stuck even after reverting those changes.

I'm not sure how this is determined either, though I thought it was inherited from the OS. If that doesn't match what you're seeing, let me know and I can open a bug and make sure it's sorted out.

> I'm wondering how you benchmark and test your anti-fingerprinting measures. Do you test them against samples or usage stats sent by the Brave clients? I'd be interested to see how representative the test set is with respect to the diversity of configurations and environments you might find across the globe.

Because we don't store any information about our users, we don't have this kind of information. We generate our fingerprint defenses in a couple of ways.

  1. pulling from academic research (see the papers linked to from the blog post)
  2. existing datasets like the ones you pull from (though those have significant problems too)
  3. familiarity with, and generalizing from, fingerprinting scripts we come across in the wild

@l0k9j
Author

l0k9j commented Jul 10, 2020

@pes10k Thanks a lot for taking the time to respond, this is really appreciated.

Here's the value of the accept-language header from three browsers on the same machine (Ubuntu 20.04, English only in my operating system settings):

Brave:
en-GB,en-US;q=0.9,en;q=0.8,fr-FR;q=0.7,fr;q=0.6
Chrome:
en-GB,en-US;q=0.9,en;q=0.8
Firefox (with resistFingerprinting disabled, i.e. default setting):
en-US,en;q=0.5

Edit: it's based on language preference in the settings, I missed it as it's collapsed by default. So it's removable. My mistake.

@pes10k
Contributor

pes10k commented Jul 10, 2020

Ah, that is very interesting and surprising! I'm going to break this into two issues then:

1. Make sure accept-lang settings are cleared / recalculated after removing spelling dictionaries
2. See if we can collapse more values in accept-lang, to reduce the identifiability there

Ah, never mind about the above then, but I appreciate you taking the time to work through this here, and the feedback!
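For reference, the second idea (collapsing values in accept-language) could look roughly like the sketch below. The collapsing policy, keeping only the top-ranked tag plus its base language with a fixed q-value, is just one hypothetical choice, and the function names are made up for illustration.

```python
def parse_accept_language(header):
    """Parse an Accept-Language header into (tag, q) pairs, in header order."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        params = params.strip()
        # A missing q parameter means the default weight of 1.0.
        q = float(params[2:]) if params.startswith("q=") else 1.0
        entries.append((tag.strip(), q))
    return entries


def collapse_accept_language(header):
    """Keep the highest-weighted tag and its base language, drop everything else."""
    entries = parse_accept_language(header)
    if not entries:
        return ""
    top = max(entries, key=lambda entry: entry[1])[0]
    base = top.split("-")[0]
    return top if top == base else f"{top},{base};q=0.9"


brave_header = "en-GB,en-US;q=0.9,en;q=0.8,fr-FR;q=0.7,fr;q=0.6"
collapsed = collapse_accept_language(brave_header)
```

With this policy the Brave and Chrome values reported earlier in the thread both collapse to the same string, at the cost of hiding secondary languages from sites that would use them for content negotiation.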
