Move fingerprintable APIs behind permissions #85
@snyderp thanks for bringing this paper to the group's attention. Since the paper was published, the group has made a number of privacy enhancements to the spec that I believe address the issue. To summarize:
Further implementation experience is being gathered for the permission model, and specification clarifications informed by this experience are being discussed in GitHub issue #74. Please review the latest spec and let us know of any further changes you think are required to mitigate the identified vector adequately.
@anssiko thanks for the follow-up! A couple of notes:
re 1. I just heard back from the paper authors. They think the permission dialog is sufficient to handle the majority of the problem (e.g. they don't see any un-permissioned access to the information), but they still think the sensors could be used to uniquely identify devices with high-quality sensors / accelerometers. To address the second problem, what do you think of limiting the precision of the relevant sensors? The authors suggest adding noise, but that would seem to be difficult both implementation- and usability-wise. Simply capping the returned values to some max number of values / decimal points might be an easier-to-standardize, easier-to-reason-about option. WDYT?
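For concreteness, here is a minimal sketch of the value-capping idea, assuming a user agent simply rounds each reading to a fixed number of decimal places before exposing it to script. The function name and the two-decimal default are illustrative only, not values anyone in this thread has proposed.

```typescript
// Hypothetical sketch: cap a sensor reading to a fixed number of decimal
// places before it is handed to script. "decimals" is an illustrative
// parameter, not an agreed value.
function capPrecision(value: number, decimals: number = 2): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

// Example: capPrecision(9.80718423) === 9.81
```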
I think it depends on how much precision needs to be removed. If the fingerprintable identifier is just above the noise floor for these sensors then we should be able to remove it without measurably impacting applications. In contrast, mitigating concerns in the Ambient Light Sensor API required reducing precision significantly and thus required a deeper understanding of the use cases and how they would be affected.
@reillyeon Sounds good, I've gone back to the paper authors to see if they can give guidance on how much precision would need to be reduced to reduce identifiability. I expect it'll just be a precision / bits-of-identification trade-off, but there might be some significant threshold they can point to where the attack becomes infeasible, or, more likely, as precision increases linearly, identification should (might?) increase exponentially, so there should be a useful / beneficial trade-off to be made. Either way, I'm speculating; I'll circle back in a few days or when I hear back from the researchers. Thanks!
@reillyeon I have heard back from the researchers, and they said they will not have time to do this analysis, though they expect it would be possible. Given that they've already done the difficult work of documenting the issue, and believe it would be possible to prevent the attack through sensor accuracy reductions, I'd like to ask the WG to reach out to a domain expert and / or statistician to figure out how much accuracy reduction / limiting would be needed to address the problem, and to come up with a proposal to address the problem through accuracy reduction. I will also ask in the next PING meeting and see if anyone there has expertise they can share too.
@snyderp thanks for your help. It seems the next PING call will be 5 Dec 2019. It would be much appreciated if you could put this topic on the agenda, discuss within PING and report back. All - if you feel like making a data-driven privacy-enhancing contribution, this is your opportunity to make a positive impact in this space, and get prominently acknowledged in relevant specifications for your contribution.
Hi @anssiko! This topic was on the agenda for our last call. We have a member seeing if someone on their team internally can work on this, and I've reached out to them for an update. That being said, I think it would still be good for the WG to pursue researchers / statisticians as well, for their guidance (internally or through outreach), since we'll need to find someone who can assist, and I expect we can reach out to non-overlapping groups.
@anssiko it looks like the team I've contacted will not be able to complete the review by Dec 5, though possibly by a follow-up call. I'm reaching out to other contacts, but in the meantime… What does the WG think about formalizing the mitigation in the paper (noise injection) instead? To my mind it seems more complex, and so harder to standardize, but if the WG thinks that'd be viable, that'd also be a fine way forward. I still think it's worth looking into capping resolution as a simpler possible mitigation, but I'm curious on your thoughts for the above as a fallback option.
My reading of the paper is that both noise injection and resolution reduction require some knowledge about the underlying hardware sensor. Before this WG recommends a particular solution, an investigation needs to be made into whether it is practical to implement across the most common platforms they target. In addition, implementations should likely consider whether or not the platform itself implements similar mitigations.
@reillyeon my understanding of the paper and from discussions with the authors is that the only connection to the hardware is that the attack requires hardware of high enough quality. The expectation of the authors, and their finding in the paper, is that as hardware gets better, the problem will only get worse. Are you saying an investigation needs to be done about whether the attack is practical across common platforms (in which case, the paper seems to find "yes, and increasingly so") or whether the mitigation is practical? I don't think the authors found any platforms that have intentional mitigations, only those with low-quality sensors that are "accidentally" protected.
I agree we should mitigate privacy concerns. From a specification point of view, I think we need to call out the concern and risk and note that user agents may mitigate concerns. I don't think the particular mitigations should be normative, as they may need to be adjusted over time, and different user agents may make different determinations over utility vs privacy. From an implementation point of view, I think user agents need to develop a practical model and implementation of the threat to have criteria to know if mitigations are effective. [edit: adding for clarity: That is, I think we need to develop a test that can validate there is a fingerprinting concern. Then, mitigations can be tested against it.] We must also determine if the threat should be mitigated at the user agent or operating system level.
@reillyeon @anssiko has your group had any success in finding an expert to look into accuracy-reduction defenses against the FP attack? @scheib punting privacy concerns as "implementation details" has been (to put it mildly) not a successful privacy strategy on the web. If a group is saying "functionality should be included" (e.g. a standard), it's incumbent on them to say how it can be done in a privacy-preserving manner. If things change and protections need adjusting, then the standard would need to be updated, just as standards are updated for many other reasons.
If I understand this right, the idea is "ship it and we'll fix it after it ships"? Also, that has not been a super successful privacy strategy on the web so far ;) Privacy-focused vendors spend a nightmare amount of time trying to figure out how privacy protections can be bolted onto shipped functionality w/o breaking websites, and… it's a losing game. Get the privacy right from the start.
Not yet. This is a multidimensional problem. We need someone who can look into implementation feasibility across all major platforms, evaluate the platform-level mitigations already in place, and understand whether and how the key use cases would be affected by these mitigations. Please feel free to recommend people from the privacy community with relevant expertise we could work with.
@anssiko as an update, it looks like a PING member from Berkeley, and a member of the Blink team, will both be looking into this a bit over the next two weeks. I believe they will specifically be looking into whether, and how much, accuracy reduction is needed for those techniques to be successful. I will let you know what I hear back. If it would be useful, I would be happy to try to put you all in touch as well.
I’d encourage involved folks to drop updates into this issue. Thank you for helping us explore the solution space for this issue.
From my read of the paper, the authors suggest two mitigations (noise injection and resolution reduction) that they seem to identify as resolving the issue altogether.
Those mitigations do seem to be hardware-specific (per @reillyeon), so we might not be able to include hard numbers in the spec, but could still provide a normative indication that sensors should be rounded using the provided algorithm, based on their hardware details. If nominal sensor gain and sensor resolution are known for each device (and accessible to the browser from the underlying operating system), then those mitigation algorithms could be implemented by all browser implementers. I’m not entirely clear on this yet, but it also seems possible that we could do a survey of the accelerometers and gyroscopes currently in use by most mobile devices and then require no more precision than slightly below the typical or lowest precision sensor in the survey. Would that be a huge detriment to functionality? I’m not sure, but if I’m reading it correctly that the iOS devices measure a range of 4000 degrees per second with a resolution (and nominal gain) of 0.061, then we’d be talking about an extra error of 0.1 / 4000 = 0.0025%. That certainly sounds small, though I’m sure it depends on the particular use cases.
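Spelling out the arithmetic in that estimate (assuming, as above, a gyroscope span of 4000 degrees per second and a 0.1 degrees-per-second rounding step):

$$\frac{0.1\ \mathrm{deg/s}}{4000\ \mathrm{deg/s}} = 2.5 \times 10^{-5} = 0.0025\%$$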
@npdoty, thanks for the update, very helpful. Re use cases, I think WebVR/XR used to be the most demanding one. It has a strict motion-to-photon latency requirement in the ballpark of <20 ms to avoid motion sickness. That said, the Sensor APIs were used to polyfill those APIs and now that there's a dedicated WebXR Device API perhaps that use case is not that strong for Sensor APIs anymore. Anyone have use cases in mind that would be compromised if we'd introduce the two proposed mitigations? Any concerns on feasibility of implementation?
@anssiko Would the WG be up for having both the noise-injection and accuracy cap approaches included as necessary for correct implementation of the standard? If so, that's terrific! (assuming no blocking concerns to your feasibility question above)
Proposed mitigation from Chromium: https://github.com/JensenPaul/sensor-fingerprint-mitigation via @JensenPaul.
Terrific! @anssiko would adding the mitigation described by @JensenPaul to the spec be controversial? If not, and if @npdoty thinks the proposed mitigation would solve the problem, then it seems like we have a way forward. :)
I'm in favor of adding this mitigation (or whatever the final version turns out to be) to the specification. In terms of requirements I think this should be a SHOULD.
I feel a bit nervous about this. Can we say a MUST if, or a MUST unless, to allow that flexibility? I'm just very eager to avoid as much ambiguity as possible in the privacy aspect of specs (it's a problem we're drowning in, and something that makes privacy research very, and unnecessarily, difficult).
I don't see much of a difference between "SHOULD" and "MUST unless" given that the definition of "SHOULD" is "there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course." I am not opposed to using "MUST" and spelling out more precisely what those particular circumstances must be.
Maybe we can decompose the proposed mitigation into smaller MUST and SHOULD requirements if that reflects reality better, e.g. considering implementation constraints? Just an idea.
Hi everyone. Here are my comments on this issue:

1- The two suggestions in the SensorID paper would mitigate the attacks proposed in the same paper. There is no assessment showing that these solutions will prevent similar attacks using other approaches (as a matter of fact, there isn't any assessment in the paper showing to what extent the suggested solutions prevent even the paper's own attacks). There is a whole field of calibration techniques (which I am not an expert in; see references 5, 6, and 10 in the same paper).

2- Adding noise to sensor readings does not seem a practical solution to me. It has been in the literature for ages, but I think it inherently conflicts with the fact that sensors are getting stronger and more accurate. In addition, there are research papers which show that even after applying noise, it is still possible to fingerprint devices to some extent (in different applications).

3- It is true that even after adding a permission, the app can still fingerprint devices by using motion sensors. However, this is no different from other sources of fingerprinting such as unique device IDs (advertising ID, phone number, device ID, unique hardware ID) and other personally identifiable information (PII). This is where the GDPR takes action. Although it is a little vague, such information will eventually be classified as personal information. Article 4 of the GDPR gives the following definition of "personal data" (https://gdpr.eu/eu-gdpr-personal-data/): "‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person."

In terms of possible recommendations in the specifications, I think unless we do a systematic study of the available implemented/recommended solutions in practice and the literature, permissioning would do for now.
@maryammjd what do you think about the mitigation (in my last suggestion and fleshed out in the Chromium proposal) to decrease precision to above the nominal gain of the common sensors, in this case 0.1 degrees per second? While sensors may certainly get more precise in the future, setting this threshold now should also provide a mitigation against the calibration parameters of more precise future sensors. (Right?) That wouldn't rely on adding noise or any per-device configuration. Setting the precision is a separate mitigation from permissions, and they could be combined. See also this thread on the public-privacy list: https://lists.w3.org/Archives/Public/public-privacy/2020JanMar/0020.html
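As a rough sketch of that suggestion (the 0.1 degrees-per-second step comes from the discussion above; the constant and function names are mine, not taken from the Chromium proposal):

```typescript
// Quantize a rotation-rate reading to multiples of a fixed step so that the
// device-specific calibration gain, which is finer than the step, is not
// observable from script.
const ROTATION_RATE_STEP = 0.1; // degrees per second; illustrative constant

function quantizeRotationRate(degPerSec: number): number {
  return Math.round(degPerSec / ROTATION_RATE_STEP) * ROTATION_RATE_STEP;
}

// Example: quantizeRotationRate(12.3456) returns 12.3 (up to floating-point rounding).
```

The same step-based rounding could presumably be applied to acceleration, with a step chosen above the nominal gain of common accelerometers.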
Wanted to second @npdoty above. Reducing accuracy seems (i) extremely unlikely to have a web compat cost, (ii) to be future-proof against higher-resolution sensors re-introducing privacy harm (which might not be the case with noise injection, as noted above), and (iii) to allow people to use device sensors w/o making themselves globally identifiable. @maryammjd I agree that GDPR can be a useful way to think about these issues, but most people on the web are not covered by GDPR, so in most cases that is not a useful privacy constraint. Also, to echo @npdoty above, permissions seem necessary but insufficient. The standard needs to support use cases that allow using device hardware, without giving up full trackability.
We are running experiments to see if the suggested countermeasures are indeed effective or not. I will keep you posted.
… meters per second squared
- updated examples to keep precision limits
- updated privacy considerations to note sensor calibration as threat
- added reference to SensorID paper
- draft attempt to address w3c#85
h/t @JensenPaul for https://github.com/JensenPaul/sensor-fingerprint-mitigation
Hi, sorry to disturb you. Has anybody implemented the approach in the SensorID paper? I tried it on sensor data from 20 Android smartphones, but got no stable result. Maybe this approach only works as an attack on iOS devices and some Google Pixel phones, but not all smartphones?
@shartoo my understanding from the authors is that the technique becomes more accurate on more recent devices with more accurate sensors, and that the iPhone / Pixel phones are only relevant because they had high-precision sensors. Could you be testing with cheaper devices? I was able to reproduce with demo code provided by the authors at the time of publication, but have not tried again since.
Chromium has implemented the mitigation described in #86.
Absolute best possible answer :) Thanks @reillyeon for the update (and thanks @shartoo for the question!)
@pes10k I've tested on the Google Pixel 4 and Huawei P30 Pro, which are recently released devices. Here are some sample results:
But I couldn't get a stable estimated gain matrix for every sample file (I sampled each device 50 times, getting one file each time; every sample file contains about 100 records of x, y, z). Some of my estimated gain matrices:
I can't figure out which step I've done wrong.
All of my sample data were collected while the device lay absolutely still on the desk, so the basic approach (not the improved approach) should work. According to the description in the paper, the computation steps should be:
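The paper's exact steps are not reproduced here. Purely as a loose illustration of the kind of computation involved, and explicitly not the SensorID procedure: factory calibration maps integer ADC counts onto a grid whose spacing is the per-axis gain, so on idealised data that spacing can be estimated from readings taken while the device is still.

```typescript
// Loose illustration only (not the SensorID paper's algorithm): estimate the
// per-axis quantisation step of calibrated readings from a stationary device.
// On idealised, noise-free data the recovered step approximates the per-axis gain.
function estimateStep(samples: number[], resolution: number = 1e-7): number {
  // Snap readings onto a fine integer lattice so differences stay exact.
  const ticks = Array.from(new Set(samples.map(v => Math.round(v / resolution))))
    .sort((a, b) => a - b);
  const gcd = (a: number, b: number): number => (b === 0 ? a : gcd(b, a % b));
  let step = 0;
  for (let i = 1; i < ticks.length; i++) {
    step = gcd(Math.abs(ticks[i] - ticks[i - 1]), step);
  }
  // Real measurements are noisy, so the paper uses a more robust estimator;
  // this GCD version only works on clean data. Returns 0 if all samples are equal.
  return step * resolution;
}
```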
@shartoo that's fantastic that you're digging into this problem, though I think at this point you might have more luck reaching out to the paper authors, who might have more insight into whether there is an issue in your implementation, or in their approach, or whether changes in browser behavior explain the difference. If you reach an answer though, I'd be very interested to know what you find out!
I've tried emailing both authors but got no response, so I wondered whether the approach is really credible. The results shown in the paper are mainly on iOS, while mine are on Android, so I'm not sure. The data and results I got were from an Android app rather than a browser. There is also fingerprintjs2, which can be embedded in a browser, whose author is also named Peter.
@shartoo, thank you for your data-driven contribution that seems to validate that the attack described in the paper has been mitigated on Android for the devices you used in your experiment. @pes10k, related: I think we should acknowledge people who make positive contributions in the important area of privacy enhancement for this specification. Feel free to think of people we should thank in https://w3c.github.io/deviceorientation/#acknowledgments I would like to acknowledge your contributions, so you don’t need to self-promote.
According to the paper, if any mitigation had been applied, the output of the ADC should be unstable. But as you can see above, at least the ADC output of the Google Pixel is very stable. I'm afraid the attack has not been mitigated yet.
@shartoo, to get a more complete picture, it would be helpful if you could publish the results from all 20 devices you tested, along with details on the OS and browser version(s). Thank you for your contribution.
@shartoo, thank you. IIUC, your results are from an Android app. As for the web specification, this issue has been addressed in b95751e to advise web engine implementers on the mitigation, and I'm aware at least Chromium has implemented it: https://crbug.com/1018180 @reillyeon may know whether there's an open bug for Android where @shartoo could report these findings.
Hi everyone. We have been running experiments on Android devices, as well as some gyroscopes on sensor kits, via native apps to fingerprint them according to the SensorID paper, and got similarly unstable results. These attacks work on sensors that are factory-calibrated. According to the project webpage:
Hi @maryammjd, I got relatively stable results on a OnePlus (ONEPLUS A6000) from accelerometer sensor data; some example results look like:
[[313.97989797 -0.00022042 -0.00079312]
[[313.97999159 0.00022686 0.00109395]
[[313.97990616 -0.00003449 0.00024145]
[[313.97998885 -0.00024576 0.00068462]
But this did not work on the Huawei P30 Pro at present, and I have not yet done further tests to check whether this could distinguish two devices with the same hardware.
I found a new paper and was not sure if everyone has seen it: I hope it is not behind a paywall for others.
According to the authors of this paper, the standard includes functionality that allows users to be fingerprinted with high fidelity: “the DeviceMotionEvent APIs, particularly DeviceMotionEvent.acceleration and DeviceMotionEvent.rotationRate” don’t require permissions to access. The standard needs to be updated so that users cannot be passively fingerprinted.
Migrated from w3c/sensors#398