
Move fingerprintable APIs behind permissions #85

Closed
pes10k opened this issue Oct 24, 2019 · 46 comments · Fixed by #86
Labels
privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response.

Comments

@pes10k commented Oct 24, 2019

According to the authors of this paper, the standard includes functionality that allows users to be fingerprinted with high fidelity: the relevant APIs (quote: “the DeviceMotionEvent APIs, particularly DeviceMotionEvent.acceleration and DeviceMotionEvent.rotationRate”) don’t require permissions to access. The standard needs to be updated so that users cannot be passively fingerprinted.

Migrated from w3c/sensors#398

@anssiko (Member) commented Oct 25, 2019

@snyderp thanks for bringing this paper to the group's attention.

Since the paper was published, the group has made a number of privacy enhancements to the spec that I believe address the issue.

To summarize:

Further implementation experience is being gathered for the permission model, and specification clarifications informed by this experience are being discussed in GitHub issue #74.

Please review the latest spec and let us know of any further changes you think are required to mitigate the identified vector adequately.

@pes10k (Author) commented Oct 26, 2019

@anssiko thanks for the follow-up! A couple of notes:

  1. re the requestPermission() update, I see your point that it seems to address the attack. I will follow up with the paper authors and see if they agree / have a way of carrying out the attack otherwise, and report back here. (A sketch of the permission-gated flow follows this list.)

  2. re: making the security and privacy considerations mandatory, I think this is a great first step, but I have two remaining concerns:

  • I suggest adding a 4th MUST condition: "fire events after the first-party context has received a user gesture"
  • In general it's rare to have mandatory material in these areas of specs; is it possible to move the same content elsewhere (e.g. into the algorithm descriptions), or at least call out these mandatory privacy requirements in the algorithm descriptions?
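
For concreteness, here is a minimal sketch of the permission-gated flow being discussed, using the requestPermission() API; the `#enable-motion` element id is a made-up example, and the feature detection allows for engines that don't expose the method:

```ts
// Sketch only: gate devicemotion listening behind requestPermission(),
// called from a click handler so that a user gesture is required (matching
// the suggested 4th MUST condition above).
const DME = DeviceMotionEvent as unknown as {
  requestPermission?: () => Promise<"granted" | "denied">;
};

document.getElementById("enable-motion")?.addEventListener("click", async () => {
  if (typeof DME.requestPermission === "function") {
    const state = await DME.requestPermission();
    if (state !== "granted") return; // no motion events without explicit consent
  }
  window.addEventListener("devicemotion", (event) => {
    console.log(event.acceleration, event.rotationRate);
  });
});
```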

@pes10k (Author) commented Oct 30, 2019

re 1: I just heard back from the paper authors. They think the permission dialog is sufficient to handle the majority of the problem (i.e. they don't see any un-permissioned access to the information), but they still think the sensors could be used to uniquely identify devices w/ high-quality sensors / accelerometers.

To address the second problem, what do you think of limiting the precision of the relevant sensors? The authors suggest adding noise, but that would seem to be difficult both implementation- and usability-wise. Simply capping the returned values to some maximum number of significant digits / decimal places might be an easier-to-standardize, easier-to-reason-about option (see the sketch below). WDYT?
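
A minimal sketch of the capping idea; the two-decimal cap is an illustrative assumption, not a number anyone has proposed:

```ts
// Sketch: cap a reported sensor value to a fixed number of decimal places.
function capPrecision(value: number, decimals = 2): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

capPrecision(9.80665012); // => 9.81
```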

@reillyeon (Member) commented:

> To address the second problem, what do you think of limiting the precision of the relevant sensors? The authors suggest adding noise, but that would seem to be difficult both implementation- and usability-wise. Simply capping the returned values to some maximum number of significant digits / decimal places might be an easier-to-standardize, easier-to-reason-about option. WDYT?

I think it depends on how much precision needs to be removed. If the fingerprintable identifier is just above the noise floor for these sensors then we should be able to remove it without measurably impacting applications. In contrast, mitigating concerns in the Ambient Light Sensor API required reducing precision significantly and thus required a deeper understanding of the use cases and how they would be affected.

@pes10k (Author) commented Oct 31, 2019

@reillyeon Sounds good, I've gone back to the paper authors to see if they can give guidance on how much precision would need to be reduced to reduce identifiability. I expect it'll just be a precision / bits-of-identification trade-off, but there might be some significance threshold they can point to where the attack becomes infeasible, or, more likely, as precision increases linearly, identification should (might?) increase exponentially, so there should be a useful / beneficial trade-off to be made.

Either way, I'm speculating; I'll circle back in a few days or when I hear back from the researchers. Thanks!

@pes10k (Author) commented Nov 7, 2019

@reillyeon I have heard back from the researchers, and they said they will not have time to do this analysis, though they expect it would be possible.

Given that they've already done the difficult work of documenting the issue, and believe it would be possible to prevent the attack through sensor accuracy reductions, I'd like to ask the WG to reach out to a domain expert and / or statistician to figure out how much accuracy reduction / limiting would be needed to address the problem, and to come up with a proposal to address the problem through accuracy reduction.

I will also ask in the next PING meeting and see if anyone there has expertise they can share too.

@anssiko (Member) commented Nov 19, 2019

@snyderp thanks for your help. It seems the next PING call will be 5 Dec 2019. It would be much appreciated if you could put this topic on the agenda, discuss within PING and report back.

All - if you feel like making a data-driven privacy-enhancing contribution, this is your opportunity to make a positive impact in this space, and get prominently acknowledged in relevant specifications for your contribution.

@pes10k (Author) commented Nov 19, 2019

Hi @anssiko! This topic was on the agenda for our last call. We have a member seeing if someone on their team internally can work on this. I've reached out to them for an update.

That being said, I think it would still be good for the WG to pursue researchers / statisticians as well, for their guidance (internally or through outreach), as we'll need to find someone who can assist, and I expect we can reach out to non-overlapping groups.

@pes10k (Author) commented Nov 19, 2019

@anssiko it looks like the team I've contacted will not be able to complete the review by Dec 5, though possibly by a follow-up call. I'm reaching out to other contacts, but in the meantime…

What does the WG think about formalizing the mitigation in the paper (noise injection) instead? To my mind it seems more complex, and so harder to standardize, but if the WG thinks that'd be viable, that'd also be a fine way forward. I still think it's worth looking into capping resolution as a simpler possible mitigation, but I'm curious about your thoughts on the above as a fallback option.

@reillyeon (Member) commented:

My reading of the paper is that both noise injection and resolution reduction require some knowledge about the underlying hardware sensor. Before this WG recommends a particular solution, an investigation needs to be made into whether it is practical to implement across the most common platforms implementations target. In addition, implementations should likely consider whether or not the platform itself implements similar mitigations.

@pes10k (Author) commented Nov 19, 2019

@reillyeon my understanding of the paper, and from discussions with the authors, is that the only connection to the hardware is that the attack requires hardware of high enough quality. The expectation of the authors, and their finding in the paper, is that as hardware gets better, the problem will only get worse. Are you saying an investigation needs to be done about whether the attack is practical across common platforms (in which case, the paper seems to find "yes, and increasingly so"), or whether the mitigation is practical?

I don't think the authors found any platforms that have intentional mitigations, only those with low-quality sensors that are "accidentally" protected.

@scheib commented Dec 4, 2019

I agree we should mitigate privacy concerns.

From a specification point of view, I think we need to call out the concern and risk and note that user agents may mitigate the concerns. I don't think the particular mitigations should be normative, as they may need to be adjusted over time, and different user agents may make different determinations over utility vs. privacy.

From an implementation point of view, I think user agents need to develop a practical model and implementation of the threat to have criteria for knowing whether mitigations are effective. [edit: adding for clarity: That is, I think we need to develop a test that can validate that there is a fingerprinting concern. Then, mitigations can be tested against it.]

We must also determine if the threat should be mitigated at the user agent or operating system level.

@pes10k (Author) commented Dec 4, 2019

@reillyeon @anssiko has your group had any success in finding an expert to look into accuracy-reduction defenses against the FP attack?

@scheib punting privacy concerns as "implementation details" has been (to put it mildly) an unsuccessful privacy strategy on the web. If a group is saying "functionality should be included" (e.g. in a standard), it's incumbent on them to say how that can be done in a privacy-preserving manner.

If things change, and protections need adjusting, then the standard would need to be updated, just as standards are updated for many other reasons.

> From an implementation point of view, I think user agents need to develop a practical model and implementation of the threat to have criteria for knowing whether mitigations are effective.

If I understand this right, the idea is "ship it and we'll fix it after it ships"? That also has not been a super successful privacy strategy on the web so far ;) Privacy-focused vendors spend a nightmarish amount of time trying to figure out how privacy protections can be bolted onto shipped functionality w/o breaking websites, and… it's a losing game. Get the privacy right from the start.

@anssiko (Member) commented Dec 5, 2019

> @reillyeon @anssiko has your group had any success in finding an expert to look into accuracy-reduction defenses against the FP attack?

Not yet. This is a multidimensional problem. We need someone who can look into implementation feasibility across all major platforms, evaluate the platform-level mitigations already in place, and understand whether and how the key use cases would be affected by these mitigations. Please feel free to recommend people from the privacy community with relevant expertise we could work with.

@pes10k (Author) commented Dec 5, 2019

@anssiko as an update, it looks like a PING member from Berkeley and a member of the Blink team will both be looking into this a bit over the next two weeks. I believe they will specifically be looking into whether, and how much, accuracy reduction is needed for these techniques to be successful. I will let you know what I hear back. If it would be useful, I would also be happy to try to put you all in touch.

@anssiko (Member) commented Dec 6, 2019

I’d encourage involved folks to drop updates into this issue. Thank you for helping us explore the solution space for this issue.

@npdoty (Contributor) commented Jan 13, 2020

From my read of the paper, the authors suggest two mitigations that they seem to identify as resolving the issue altogether:

  1. adding random noise of −0.5 to 0.5 to each value and then rounding to the resolution of the sensor (16 bits, in the case of the iOS devices in question)
  2. rounding the output to the nearest multiple of the nominal gain of the sensor (61 or 70 millidegrees per second, in the case of the iOS devices in question)

Those mitigations do seem to be hardware-specific (per @reillyeon), so we might not be able to include hard numbers in the spec, but we could still provide a normative indication that sensor readings should be rounded using the provided algorithm, based on the hardware details. If nominal sensor gain and sensor resolution are known for each device (and accessible to the browser from the underlying operating system), then those mitigation algorithms could be implemented by all browser implementers.

I’m not entirely clear on this yet, but it also seems possible that we could do a survey of the accelerometers and gyroscopes currently in use by most mobile devices and then require no more precision than slightly below the typical or lowest precision sensor in the survey. Would that be a huge detriment to functionality? I’m not sure, but if I’m reading it correctly that the iOS devices measure a range of 4000 degrees per second with a resolution (and nominal gain) of 0.061, then we’d be talking about an extra error of 0.1 / 4000 = 0.0025%. That certainly sounds small, though I’m sure it depends on the particular use cases.
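
To make the two mitigations above concrete, here is a hedged sketch; it assumes the browser can obtain each sensor's resolution (output units per ADC step) and nominal gain from the operating system, which the spec does not currently guarantee:

```ts
// Mitigation 1 (as I read it): add uniform noise in [-0.5, 0.5) ADC steps,
// then round back to the sensor's resolution.
function noiseAndRound(value: number, resolution: number): number {
  const noisySteps = value / resolution + (Math.random() - 0.5);
  return Math.round(noisySteps) * resolution;
}

// Mitigation 2: round to the nearest multiple of the nominal gain
// (e.g. 0.061 deg/s for the iOS gyroscopes discussed above).
function roundToNominalGain(value: number, nominalGain: number): number {
  return Math.round(value / nominalGain) * nominalGain;
}
```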

@anssiko (Member) commented Jan 14, 2020

@npdoty, thanks for the update, very helpful.

Re use cases, I think WebVR/XR used to be the most demanding one. It has a strict motion-to-photon latency requirement, in the ballpark of <20 ms, to avoid motion sickness. That said, the Sensor APIs were used to polyfill those APIs, and now that there's a dedicated WebXR Device API, perhaps that use case is not as strong for the Sensor APIs anymore.

Anyone have use cases in mind that would be compromised if we'd introduce the two proposed mitigations?

Any concerns on feasibility of implementation?

@pes10k (Author) commented Jan 14, 2020

@anssiko Would the WG be up for having both the noise-injection and accuracy-cap approaches included as necessary for a correct implementation of the standard? If so, that's terrific! (assuming no blocking concerns on your feasibility question above)

@anssiko (Member) commented Jan 15, 2020

Proposed mitigation from Chromium: https://github.com/JensenPaul/sensor-fingerprint-mitigation via @JensenPaul.

@pes10k (Author) commented Jan 15, 2020

Terrific! @anssiko would adding the mitigation described by @JensenPaul to the spec be controversial? If not, and if @npdoty thinks the proposed mitigation would solve the problem, then it seems like we have a way forward. :)

@reillyeon (Member) commented:

I'm in favor of adding this mitigation (or whatever the final version turns out to be) to the specification. In terms of requirements I think this should be a SHOULD rather than a MUST to give implementations room to delegate to mitigations implemented in the platform or improvements in the underlying sensor hardware.

@pes10k (Author) commented Jan 15, 2020

I feel a bit nervous about this. Can we say a "MUST if", or a "MUST unless", to allow that flexibility? I'm just very eager to avoid as much ambiguity as possible in the privacy aspects of specs (it's a problem we're drowning in, and something that makes privacy research very, and unnecessarily, difficult).

@reillyeon (Member) commented:

I don't see much of a difference between "SHOULD" and "MUST unless" given that the definition of "SHOULD" is "there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course."

I am not opposed to using "MUST" and spelling out more precisely what those particular circumstances must be.

@anssiko (Member) commented Jan 29, 2020

Maybe we can decompose the proposed mitigation into smaller MUST and SHOULD requirements, if that reflects reality better, e.g. considering implementation constraints? Just an idea.

@maryammjd commented:

Hi everyone. Here are my comments on this issue:

1- The two suggestions in the SensorID paper would mitigate the attacks proposed in the same paper. There isn't any assessment showing that these solutions will prevent similar attacks using other approaches (as a matter of fact, there isn't any assessment in the paper showing that the suggested solutions will prevent their own attacks, or to what extent). There is a whole field of calibration techniques (which I am not an expert in; see references 5, 6, and 10 in the same paper).

2- Adding noise to sensor readings does not seem a practical solution to me. It has been in the literature for ages, but I think it inherently conflicts with the fact that sensors are getting stronger and more accurate. In addition, there are research papers which show that even after applying noise, it is still possible to fingerprint devices to some extent (in different applications).

3- It is true that after adding a permission, the app can still fingerprint devices by using motion sensors. However, this is no different from other sources of fingerprinting such as unique device IDs (advertising ID, phone number, device ID, unique hardware ID) and other personally identifiable information (PII). This is where the GDPR takes action. Although it is a little vague, such information will eventually be classified as personal information.

GDPR Article 4 gives the following definition of “personal data” (https://gdpr.eu/eu-gdpr-personal-data/): "‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person."

In terms of possible recommendations in the specification, I think unless we do a systematic study of the available implemented/recommended solutions in practice and in the literature, permissioning will do for now.

@npdoty (Contributor) commented Feb 5, 2020

@maryammjd what do you think about the mitigation (in my last suggestion and fleshed out in the Chromium proposal) to decrease precision to above the nominal gain of the common sensors, in this case 0.1 degrees per second? While sensors may certainly get more precise in the future, setting this threshold now should also provide a mitigation against the calibration parameters of more precise future sensors. (Right?) That wouldn't rely on adding noise or any per-device configuration.

Setting the precision is a separate mitigation from permissions, and they could be combined.

See also this thread on the public-privacy list: https://lists.w3.org/Archives/Public/public-privacy/2020JanMar/0020.html
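
A minimal sketch of that fixed-threshold variant; `quantizeRotationRate` is a hypothetical helper applied to each reading before an event is fired, not spec text:

```ts
// Sketch: quantize rotation rate to multiples of 0.1 deg/s regardless of the
// underlying hardware, so no per-device configuration is needed.
const STEP_DEG_PER_S = 0.1;

function quantizeRotationRate(degPerSecond: number): number {
  return Math.round(degPerSecond / STEP_DEG_PER_S) * STEP_DEG_PER_S;
}
```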

@pes10k (Author) commented Feb 5, 2020

Wanted to second @npdoty above. Reducing accuracy seems (i) extremely unlikely to have a web-compat cost, (ii) to be future-proof against higher-resolution sensors re-introducing privacy harm (which might not be the case with noise injection, as noted above), and (iii) to allow people to use device sensors w/o making themselves globally identifiable.

@maryammjd I agree that GDPR can be a useful way to think about these issues, but most people on the web are not covered by GDPR, so in most cases it is not a useful privacy constraint. Also, to echo @npdoty above, permissions seem necessary but insufficient. The standard needs to support use cases that allow using device hardware without giving up full trackability.

@plehegar added the privacy-tracker label Feb 10, 2020
@maryammjd commented:

> @maryammjd what do you think about the mitigation (in my last suggestion and fleshed out in the Chromium proposal) to decrease precision to above the nominal gain of the common sensors, in this case 0.1 degrees per second? While sensors may certainly get more precise in the future, setting this threshold now should also provide a mitigation against the calibration parameters of more precise future sensors. (Right?) That wouldn't rely on adding noise or any per-device configuration.
>
> Setting the precision is a separate mitigation from permissions, and they could be combined.
>
> See also this thread on the public-privacy list: https://lists.w3.org/Archives/Public/public-privacy/2020JanMar/0020.html

We are running experiments to see if the suggested countermeasures are indeed effective or not. I will keep you posted.

npdoty added a commit to npdoty/deviceorientation that referenced this issue May 29, 2020
… meters per second squared

updated examples to keep precision limits
updated privacy considerations to note sensor calibration as threat
added reference to sensorid paper

draft attempt to address w3c#85

h/t @JensenPaul for https://github.com/JensenPaul/sensor-fingerprint-mitigation
@npdoty linked a pull request Jun 1, 2020 that will close this issue
@shartoo commented Aug 10, 2020

Hi, sorry to disturb you: has anybody implemented the approach in the SensorID paper? I tried it on sensor data from 20 Android smartphones but got no stable results. Maybe this approach only works as an attack on iOS devices and some Google Pixels, but not all smartphones?

@pes10k (Author) commented Aug 10, 2020

@shartoo my understanding from the authors is that the technique becomes more accurate on more recent devices with more accurate sensors, and that the iPhone / Pixel phones are only relevant because they had high-precision sensors.

Could you be testing with cheaper devices? I was able to reproduce with the demo code provided by the authors at the time of publication, but have not tried again since.

@reillyeon (Member) commented:

Chromium has implemented the mitigation described in #86.

@pes10k (Author) commented Aug 10, 2020

Absolute best possible answer :) Thanks @reillyeon for the update (and thanks @shartoo for the question!)

@shartoo commented Aug 11, 2020

@pes10k I've tested on a Google Pixel 4 and a Huawei P30 Pro, which are recently released devices. Here are some sample results:

| device | direct ADC output (accelerometer, m/s²) | difference of sequence of ADC output |
| --- | --- | --- |
| Google Pixel 4 | acce_out_xyz_0 000000_10 | acce_seq_substract_xyz_0 000000_10 |
| Huawei P30 Pro | acce_out_xyz_0 000000_8 | acce_seq_substract_xyz_0 000000_8 |

@shartoo commented Aug 11, 2020

But I couldn't get a stable estimated gain matrix for every sample file (I sampled each device 50 times, getting one file per run; every sample file contains about 100 records of x, y, z). Some of my estimated gain matrices:

  • Google Pixel 4, file 1:
    [[ 0. 0. 0. ]
    [ 0. 0. 0. ]
    [ 0. 0. 156.762112]]

  • Google Pixel 4, file 2:
    [[52.2719942 52.2719942 52.2719942 ]
    [52.27195051 52.27195051 52.27195051]
    [52.25403733 52.25403733 52.25403733]]

I can't figure out which step I've done wrong.

@shartoo commented Aug 11, 2020

All of my sample data were collected while the device stayed absolutely still on the desk, so the basic approach (not the improved approach) should work.

According to the description in the paper, the computation steps should be:

  1. Data preprocessing: compute the difference of the data sequence.
  2. ADC value recovery: take G_0 (nominal gain times the identity matrix) as the initial G, obtain a delta_A, then round A to integer values to check validity.
  3. Gain matrix estimation: estimate G.
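
A rough sketch of how I read steps 1–2; the nominal gain below is a placeholder value, not a number from the paper:

```ts
// Placeholder nominal gain (output units per ADC count); per-device in reality.
const NOMINAL_GAIN = 0.0023942;

// Step 1: difference consecutive readings to cancel the constant offset/bias.
function diffs(readings: number[]): number[] {
  const out: number[] = [];
  for (let i = 1; i < readings.length; i++) out.push(readings[i] - readings[i - 1]);
  return out;
}

// Step 2: recover ADC count differences with G_0 = nominalGain * I and check
// how close they land to integers (the validity check described above).
function adcResiduals(readings: number[], gain = NOMINAL_GAIN): number[] {
  return diffs(readings).map((d) => {
    const counts = d / gain;
    return Math.abs(counts - Math.round(counts)); // ~0 when the gain estimate is right
  });
}
```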

@pes10k (Author) commented Aug 11, 2020

@shartoo that's fantastic that you're digging into this problem, though I think at this point you might have more luck reaching out to the paper authors, who might have more insight into whether there is an issue in your implementation or their approach, or whether changes in browser behavior explain the difference. If you reach an answer, though, I'd be very interested to know what you find out!

@shartoo commented Aug 11, 2020

I've tried emailing both authors but got no response, so I wondered whether the approach is really credible. But the results shown in the paper are mainly on iOS, while mine are on Android, so I'm not sure. The data and results I got were from an Android app rather than a browser. There is also fingerprintjs2, which can be embedded in a browser, and whose author is also named Peter.

@anssiko (Member) commented Aug 11, 2020

@shartoo, thank you for your data-driven contribution, which seems to validate that the attack described in the paper has been mitigated on Android for the devices you used in your experiment.

@pes10k, related: I think we should acknowledge people who make positive contributions in the important area of privacy enhancement for this specification. Feel free to think of people we should thank in https://w3c.github.io/deviceorientation/#acknowledgments. I would like to acknowledge your contributions, so you don’t need to self-promote.

@shartoo commented Aug 11, 2020

According to the paper, if any mitigation had been applied, the ADC output should be unstable. But as you can see above, at least the ADC output of the Google Pixel is very stable. I'm afraid the attack has not been mitigated yet.

@anssiko (Member) commented Aug 11, 2020

@shartoo, to get a more complete picture, it would be helpful if you could publish the results from all 20 devices you tested, along with details on the OS and browser version(s). Thank you for your contribution.

@shartoo commented Aug 11, 2020

Here are the details for some of the test devices; not all data are included.

| device | OS version (Android) | ADC output | difference of ADC output |
| --- | --- | --- | --- |
| Google Pixel 4 | 11 | acce_out_xyz_0 000000_10 | acce_seq_substract_xyz_0 000000_10 |
| Huawei Honor 9X | 10 | acce_out_xyz_0 000000_21 | acce_seq_substract_xyz_0 000000_21 |
| Huawei Mate 20 | 10 | acce_out_xyz_0 000000_6 | acce_seq_substract_xyz_0 000000_6 |
| Huawei nova 3 | 9 | acce_out_xyz_0 000000_24 | acce_seq_substract_xyz_0 000000_24 |
| Huawei P30 Pro | 10 | acce_out_xyz_0 000000_26 | acce_seq_substract_xyz_0 000000_26 |
| OPPO R9S | 6.0.1 | acce_out_xyz_0 000000_24 | acce_seq_substract_xyz_0 000000_24 |
| OPPO R11 | 7.1.1 | acce_out_xyz_0 000000_36 | acce_seq_substract_xyz_0 000000_36 |
| VIVO X9 | 7.1.2 | acce_out_xyz_0 000000_23 | acce_seq_substract_xyz_0 000000_3 |

@anssiko (Member) commented Aug 11, 2020

@shartoo, thank you. IIUC, your results are from an Android app. As for the web specification, this issue has been addressed in b95751e, which advises web engine implementers on the mitigation, and I'm aware at least Chromium has implemented it: https://crbug.com/1018180

@reillyeon may know whether there's an open bug for Android where @shartoo could report these findings.

@maryammjd commented:

Hi everyone,

We have been running experiments on Android devices, as well as some gyros on sensor kits, via native apps, to fingerprint them per the SensorID paper, and got similarly unstable results. These attacks work on sensors that are factory-calibrated. According to the project webpage:

"Can we conduct the same attack to fingerprint other Android devices?
We have found that the accelerometer in Google Pixel 2 and Pixel 3 can be fingerprinted by our calibration fingerprinting attack. It is possible that some other Android devices are also factory calibrated and thus can be fingerprinted. However, we only have data from a few Android device models; the Android device models we have tested, apart from Google Pixel 2 and 3, cannot be fingerprinted using our approach."

@shartoo commented Aug 12, 2020

Hi @maryammjd, I got relatively stable results on a OnePlus device (ONEPLUS A6000) for accelerometer sensor data; some example results look like:

  • estimated gain matrix of sample file 1

[[313.97989797 -0.00022042 -0.00079312]
[ 0.00005471 313.98001487 -0.00009762]
[ 0.00196805 -0.02164853 313.95306167]]

  • estimated gain matrix of sample file 2

[[313.97999159 0.00022686 0.00109395]
[ -0.00001855 313.98011056 -0.00008026]
[ 0.00544453 0.01461957 313.97803559]]

  • estimated gain matrix of sample file 3

[[313.97990616 -0.00003449 0.00024145]
[ -0.00002185 313.98007057 0.00002185]
[ 0.01494681 0.00114975 313.97492772]]

  • estimated gain matrix of sample file 4

[[313.97998885 -0.00024576 0.00068462]
[ 0.00000731 313.98014976 0.00006495]
[ 0.0016579 -0.00273067 313.97497905]]

But this did not work on the Huawei P30 Pro at present, and I have not yet done further tests of whether this could distinguish two devices with the same hardware.

@maryammjd commented:

I found a new paper and wasn't sure if everyone had seen it:
"The Seven Deadly Sins of the HTML5 WebAPI: A Large-scale Study on the Risks of Mobile Sensor-based Attacks"

I hope it is not behind a paywall for others.
