From a0bb87d45b35a74613d9b2416317066ef84efdc7 Mon Sep 17 00:00:00 2001 From: Penelope McLachlan Date: Tue, 19 Dec 2023 15:35:54 -0800 Subject: [PATCH 1/3] Update explainer.md - Edited for flow/readability - Added a challenge we intend to address with PEPC (insufficiency of existing mitigations) - Added a rejected alternative, an allow list based approach --- explainer.md | 631 ++++++++++++++++++++++++++++----------------------- 1 file changed, 341 insertions(+), 290 deletions(-) diff --git a/explainer.md b/explainer.md index 644f679..55e86bf 100644 --- a/explainer.md +++ b/explainer.md @@ -4,12 +4,12 @@ When making decisions about whether or not to expose particularly powerful capabilities to a given website, user agents generally -[pass the question on to users themselves](#permission-prompts-ux-evaluation). +[pass the question on to users](#permission-prompts-ux-evaluation). Historically, this began as a fairly direct passthrough: a site would ask for -some capability and the user agent would pop up a prompt asking them to make a +some capability and the user agent immediately prompts asking users to make a decision for the request. -Rapidly, user agents realized that a more opinionated approach was necessary to +Spam and abuse have forced user agents to take a more opinionated approach to protect users' security, privacy, and attention. A number of preconditions and mitigation measures have evolved, ranging from straightforward [user activation requirements](https://developer.mozilla.org/en-US/docs/Web/Security/User_activation), @@ -18,96 +18,107 @@ permanent "block" policies, or However these measures have limited effect [as indicated by metrics](#user-agent-abuse-mitigations). -There are three main challenges with the status quo: - -1. **Context**: Ideally, a site's developer will request access as part of a - contextual flow that helps users understand what's being asked for and why, - enabling quick and confident responses. 
Often, however, the permissions - requests are correlated poorly with user expectations, up to and including - prompts that come seemingly out of nowhere (see example 1). This places an - enormous burden on user agents' presentation of the request. However, the - user agent has no semantic understanding of events taking place in the - content area prior to the permission request. User agents could make better - decisions and provide better prompts if they could make well-founded - assumptions about the nature of the user's interaction in the content area, - and the user's intent. - - ![](images/image1.png) \ - _Example 1. A notification permission prompt on a news site (contents blurred), - shown after the user has clicked on the empty area next to the article content. - The user finds this prompt interruptive as they had no interest in subscribing - to notifications, and they will likely struggle to understand why the prompt was - shown to begin with._ - -1. **Location**: In the ideal case above, users will interact with something on - a site that triggers a prompt. In less ideal cases, the user might not have - interacted with anything at all, or they may have interacted with an element - that was unrelated to the request. Given this uncertainty, user agents rely - on common placement of the permission prompt, usually in the top-left of the - page. Even in the best case, this has the unfortunate effect of shifting the - point to which users need to pay attention from the thing they clicked on to - some distant part of the user agent's UI (see example 2). User agents could - make better decisions and provide better prompts if they could make - well-founded assumptions about the nature of the user's interaction in the - content area, and the user's current area of focus. - - ![](images/image2.png) \ - _Example 2. An example where the permission prompt is far away from the user's - current area of focus. 
The permission prompt was triggered because the user has - just clicked on the crosshair icon in the bottom right, but the prompt is easy - to miss since it's on the opposite side of the page._ - -1. **Regret**: When users decline to provide a capability to a given site, it - seems quite reasonable for user agents to suppress that site's future - requests for the same capability in order to avoid annoyance or abuse. That - said, users change their minds for good reasons, but sometimes struggle to - understand how to express that new decision to the user agent (see example - 3). In these cases, the user agent's desire to protect the user backfires, - leaving the user confused about how to make the site they want to use work - the way they want it to.User agents could make better decisions and provide - better re-prompt UI if they could make well-founded assumptions about the - nature of the user's interaction in the content area, and the user's intent. - - ![](images/image3.png) \ - _Example 3. An example where the user previously blocked camera and microphone - access, but has now just expressed a strong intention to re-enable them by clicking - the unmute buttons. Because the user agent has no insight into this interaction - in the content area, it is compelled to respect the user's previous decision. - Especially in a stressful scenario such as an important presentation, users will - struggle to navigate the settings surfaces to change the permission decision._ - -Overall, attempts to optimize the trade-off between usability and interruptions -have hit practical limits because, fundamentally, user agents +There are four main challenges with the status quo: + +1. 
**Insufficiency of existing mitigations** The present day permissions spam + and abuse mitigation approach has an architectural upper bound on user + protection because the model relies on the website to choose when to trigger + the permission request prompt rather than capturing a reliable signal of + user intent. Requiring a user gesture for the Permission API (or similar) + does not solve this problem as there are many ways of tricking a user into + providing such a gesture. + +1. **Context**: Ideally, a site's developer will request access as part of a + contextual flow that helps users understand what's being asked for and why, + enabling quick and confident responses. Often, however, permission requests + are correlated poorly with user expectations, up to and including prompts + that can come out of nowhere (see example 1). This places a burden on user + agents' presentation of the request. The user agent has no semantic + understanding of events taking place in the content area prior to the + permission request. User agents could make better decisions and provide + better prompts if they could make well-founded assumptions about the nature + of the user's interaction in the content area, and the user's intent. + + ![](images/image1.png) \ + *Example 1. A notification permission prompt on a news site (contents + blurred), shown after the user has clicked on the empty area next to the + article content. The user finds this prompt interruptive as they had no + interest in subscribing to notifications, and they will likely struggle to + understand why the prompt was shown to begin with.* + +1. **Location**: In the ideal case above, users will interact with something on + a site that triggers a prompt. In less ideal cases, the user might not have + interacted with anything at all, or they may have interacted with an element + that was unrelated to the request. 
Given this uncertainty, user agents rely + on common placement of the permission prompt, usually in the top-left of the + page. Even in the best case, this has the unfortunate effect of shifting the + point to which users need to pay attention from the thing they clicked on to + some distant part of the user agent's UI (see example 2). User agents could + make better decisions and provide better prompts if they could make + well-founded assumptions about the nature of the user's interaction in the + content area, and the user's current area of focus. In effect, we think + there is a benefit to semantic markup for permissions. + + ![](images/image2.png) \ + *Example 2. An example where the permission prompt is far away from the + user's current area of focus. The permission prompt was triggered because + the user has just clicked on the crosshair icon in the bottom right, but the + prompt is easy to miss since it's on the opposite side of the page.* + +1. **Regret**: Given the challenges of permission annoyance and abuse, it is + reasonable for user agents to suppress a site's future requests for the same + capability when the first request is blocked. That said, our research shows + that users can and do change their minds for good reasons. When they change + their mind, the site can no longer offer an interface in web content and the + user must search for the appropriate user agent surface. Our research shows + that users often fail when trying to do so (see example 3). In these cases, + the user agent's desire to protect the user backfires, and makes the user's + experience worse as the site will not work as the user wants. User agents + can help users recover from a permission regret state if they can make + well-founded assumptions about the nature of the user's intent and + interaction with web content. + + ![](images/image3.png) \ + *Example 3. 
An example where the user previously blocked camera and + microphone access, but has now just expressed a strong intention to + re-enable them by clicking the unmute buttons. Because the user agent has no + insight into this interaction in the content area, it is compelled to + respect the user's previous decision. Especially in a stressful scenario + such as an important presentation, users will struggle to navigate the + settings surfaces to change the permission decision.* + +Optimizing the trade-off between usability and interruptions hit practical +limits because, fundamentally, user agents [still lack any understanding of the](#permission-prompts-ux-evaluation) semantics of user interactions in the content area (i.e. the web page), and consequently lack insight into the user's present context and task they are trying to accomplish. -To be able to meaningfully improve upon the status quo, user agents need to be -able to extract more trustworthy signals from the content area about the user's -task and intent, so they can be more opinionated and confident in their -communication to users regarding capability access. This is especially important -if user agents want to safely enable users to change their minds without -abdicating their responsibility for representing users' earlier permanent block -decisions. +To improve upon the status quo, user agents need to be able to extract +trustworthy signals from the content about the user's task and intent, so they +can be more opinionated and confident in their communication to users regarding +capability access. This is especially important if user agents want to safely +enable users to change their minds while still *respecting user's earlier +permanent block decisions*. ## Proposal -_Summary: We propose adding a new HTML element to the web platform which will be -used to provide an in-content entry point to permission requests. This HTML -element will look like a button and be used just like any other HTML element. 
-The key difference is that clicking this button will trigger a permission -request for which the user agent can have good confidence that it was -user-initiated._ _We propose the name of "Page Embedded Permission Control" -which can be abbreviated as PEPC._ +*Summary: We propose a new HTML element to the web platform which will be used +to provide an in-content entry point to permission requests. This HTML element +will look like a button and be used just like any other HTML element. The key +difference is that clicking this button will trigger a permission request for +which the user agent can have good confidence that it was user-initiated. The +element will have appropriate safeguards to protect users from common spam and +abuse patterns such as click jacking.* *We propose the name "Page Embedded +Permission Control" which can be abbreviated as PEPC.* To extract a strong signal of user intent, we believe that user agents require -some verification of the user interaction step that happened in the content area +verification of the user interaction step that happened in the content area directly before the developer triggers the showing of the permission prompt. We propose to achieve this through introducing a `<permission>` element: a -semi-trusted UI element that the developer can embed into the content area. At -its simplest, the element takes the shape of a button whose +semantic and semi-trusted UI element that the developer can embed into the +content area. At its simplest, the element takes the shape of a button whose [appearance](#locking-the-pepc-style) and [behavior](#restrictions) are materially [controlled](#security) by the user agent, to the extent that is necessary to ensure interaction with this element is a strong indication of user @@ -118,38 +129,40 @@ today, either as part of their onboarding experience, or as a permanently displayed affordance on their UI. 
These developers invite the user to click on a button to indicate interest, and see grant rates as high as 95% in the permission prompts that follow. For these developers, the permission element -will be a drop-in replacement that is straightforward to adopt. Here are some -real-life examples: +will be a drop-in replacement that is straightforward to adopt and easy to +polyfill on browsers which do not support the PEPC. Here are some real-life +examples: ![](images/image4.png) \ -_Example 4: A video-conferencing site. Clicking on the "Enable camera" button triggers -a camera permission request._ +*Example 4: A video-conferencing site. Clicking on the "Enable camera" button +triggers a camera permission request.* ![](images/image5.png) \ -_Example 5: A search site. Clicking on "Use precise location" triggers a geolocation -permission request._ +*Example 5: A search site. Clicking on "Use precise location" triggers a +geolocation permission request.* ![](images/image6.png) \ -_Example 6: A messaging site, clicking on the "Enable Desktop Notifications" button -triggers a push notifications permission request._ +*Example 6: A messaging site, clicking on the "Enable Desktop Notifications" +button triggers a push notifications permission request.* We believe that enshrining such a user-initiated approach in standards can contribute to consistently better permission request flows across the web. This is because the permission element offers the following compelling advantages to users and developers alike: -- It is **non-interruptive**: it is static, small, and contained in the content - area on the same z-level. -- It is **discoverable**: it can be placed by the developer within the user's - focus of attention; with the locality making it easier to find and more - convenient to interact with. 
-- It provides more **contextual information**: it has a visual manifestation as - opposed to being a procedural API, requiring developers to think about - integrating it into the user journey at UX design time, as opposed to being - left as an afterthought during implementation, resulting in knock-on effects - relating to clearer context. -- It allows users to **revert** a previous "deny" decision if they have changed - their mind and are now interested in the feature that the site provides. +- It is **non-interruptive**: it is static, small, and contained in the + content area on the same z-level. +- It is **discoverable**: it can be placed by the developer within the user's + focus of attention; with the locality making it easier to find and more + convenient to interact with. +- It provides more **contextual information**: it has a visual manifestation + as opposed to being a procedural API, requiring developers to think about + integrating it into the user journey at UX design time, as opposed to being + left as an afterthought during implementation, resulting in knock-on effects + relating to clearer context. +- It allows users to **revert** a previous "deny" decision if they have + changed their mind and are now interested in the feature that the site + provides. Example usage: @@ -287,7 +300,7 @@ interaction with the PEPC (for example, user agents generally allow users to control permissions on various UI surfaces that are entirely separate from the site's rendering area). Therefore events specific to the PEPC will only deal with the user's actions on the Permission UI, and specifically with the user -closing it either by dismissing it or by taking some other action on it that +closing it either by dismissing it or by taking some other action on it that causes it to close (e.g. they accept it). This allows sites to respond to this event by providing more context to potentially help the user make a decision. 
These two events will be added to @@ -295,15 +308,16 @@ These two events will be added to only target `permission` HTML elements. They do not bubble and are not cancelable. -- `ondismiss` - raised when the permission UI triggered by the PEPC has - been dismissed by the user (for example via clicking the 'x' button or - clicking outside the prompt) -- `onresolve` - raised when the permission UI triggered by the PEPC has - been resolved by the user taking some action on the prompt itself. Note that - this does not necessarily mean the permission state has changed, the user - might have taken an action that maintains the status quo (such as an action - that continues allowing a permission on a - [previously granted](#ui-when-the-permission-is-already-granted) type of UI). +- `ondismiss` - raised when the permission UI triggered by the PEPC has been + dismissed by the user (for example via clicking the 'x' button or clicking + outside the prompt) +- `onresolve` - raised when the permission UI triggered by the PEPC has been + resolved by the user taking some action on the prompt itself. Note that this + does not necessarily mean the permission state has changed, the user might + have taken an action that maintains the status quo (such as an action that + continues allowing a permission on a + [previously granted](#ui-when-the-permission-is-already-granted) type of + UI). Example usage: @@ -421,12 +435,12 @@ permission and to potentially allow the user to configure their decision. 
It is up to the user agent to design this confirmation UI, however there are some considerations that should be taken into account: -- The user agent should consider different UI for different scenarios based on - the current permission status -- The user agent should consider making use of the PEPC relative page position -- The user agent should consider how the PEPC interacts with any mechanisms they - have in place that would normally prevent permission request from reaching the - user +- The user agent should consider different UI for different scenarios based on + the current permission status +- The user agent should consider making use of the PEPC relative page position +- The user agent should consider how the PEPC interacts with any mechanisms + they have in place that would normally prevent permission request from + reaching the user #### Standard UI @@ -445,13 +459,13 @@ And a close-up of just the confirmation UI: Key points to consider: -- The confirmation UI can make use of the PEPC position to position itself on - the screen -- The confirmation UI can be brought more into attention by the user agent. In - the example above this is done by the user agent applying a gray filter over - the site content area -- The confirmation UI should have an obvious way for the user to change their - mind +- The confirmation UI can make use of the PEPC position to position itself on + the screen +- The confirmation UI can be brought more into attention by the user agent. In + the example above this is done by the user agent applying a gray filter over + the site content area +- The confirmation UI should have an obvious way for the user to change their + mind #### UI when the user can't change the permission @@ -508,30 +522,31 @@ The goal of user agents should be to ensure that the PEPC is not trivial to abuse. 
Therefore the user agent should consider the following potentially malicious tactics and mitigate them: -- The site could trick the user by choosing some misleading text (e.g. "Click - here to proceed"). Therefore the text on the PEPC should not be able to be set - by the site, instead the user agent should make sure to set it to something - comprehensive (e.g. "Share location" for a geolocation PEPC). \ - **Open question:**should there be a mechanism that allows the site to pick - one of several flavors of text (example: "Share location" vs "Use location")? -- The style of the PEPC can be set to obscure the purpose (e.g. setting the same - text color and button color would make the text unreadable). Therefore the - style should be verified, validated and overridden by the user agent as - needed. More details in the [Locking the PEPC style](#locking-the-pepc-style) - section -- The PEPC might be partially covered (to hide the text) with another HTML - element. Therefore the user agent should verify that the PEPC has been visible - already for some short time (e.g. 500ms or so) before it's clicked. User - agents that implement the - [IntersectionObserverV2](https://github.com/w3c/IntersectionObserver/blob/v2/explainer.md) - API can make use of it internally. -- The site might try to obtain a click on the PEPC by moving it where the user - is about to click. Therefore the user agent should ensure that the PEPC has - not been moved recently (e.g. in the past 500ms or so). -- The site might try to obtain a click on the PEPC by inserting it into the DOM - where the user is about to click. Therefore the user agent should ensure that - the PEPC has not been inserted into the DOM recently (e.g. in the past 500ms - or so). +- The site could trick the user by choosing some misleading text (e.g. "Click + here to proceed"). 
Therefore the text on the PEPC should not be able to be + set by the site, instead the user agent should make sure to set it to + something comprehensive (e.g. "Share location" for a geolocation PEPC). \ + **Open question:**should there be a mechanism that allows the site to pick + one of several flavors of text (example: "Share location" vs "Use + location")? +- The style of the PEPC can be set to obscure the purpose (e.g. setting the + same text color and button color would make the text unreadable). Therefore + the style should be verified, validated and overridden by the user agent as + needed. More details in the + [Locking the PEPC style](#locking-the-pepc-style) section +- The PEPC might be partially covered (to hide the text) with another HTML + element. Therefore the user agent should verify that the PEPC has been + visible already for some short time (e.g. 500ms or so) before it's clicked. + User agents that implement the + [IntersectionObserverV2](https://github.com/w3c/IntersectionObserver/blob/v2/explainer.md) + API can make use of it internally. +- The site might try to obtain a click on the PEPC by moving it where the user + is about to click. Therefore the user agent should ensure that the PEPC has + not been moved recently (e.g. in the past 500ms or so). +- The site might try to obtain a click on the PEPC by inserting it into the + DOM where the user is about to click. Therefore the user agent should ensure + that the PEPC has not been inserted into the DOM recently (e.g. in the past + 500ms or so). The user agent-rendered confirmation UI after the user clicks on the PEPC is what makes the PEPC ultimately secure. User agents should take proper care to @@ -550,27 +565,27 @@ should then be. There are 3 main possible approaches to consider, if the integrity of the PEPC click is not assured: -- The click triggers the legacy permission flow (as if it was triggered by the - equivalent JS API). 
This approach is worth considering if the failing check or - mitigation is not something self-correcting (e.g. styling issue or the PEPC - being covered). -- The click does nothing. This approach is worth considering if the failing - check or mitigation will self-correct itself (e.g. if the PEPC has moved - recently there will be a short cooldown before the PEPC integrity is - restored). -- The PEPC could be corrected by the user agent itself in order to preserve its - integrity. For example, if the style specified by the site sets the PEPC font - to be too small to read, this can be corrected by the user agent by forcing a - minimum font size on the PEPC. This should be considered primarily in the case - of CSS, which the user agent can override as it sees fit. **Open question:** - should there be a way for sites to specify whether they want to allow the user - agent to override the style? Some site authors might be happier triggering the - legacy prompt request flow, rather than have the PEPC style be changed whereas - others might prioritize the benefits of a PEPC permission flow over making - sure the style is exactly as desired. User agents need to weigh in the - additional flexibility afforded to site authors against the potential user - confusion of seeing the PEPC permission prompt vs the regular permission - prompt. +- The click triggers the legacy permission flow (as if it was triggered by the + equivalent JS API). This approach is worth considering if the failing check + or mitigation is not something self-correcting (e.g. styling issue or the + PEPC being covered). +- The click does nothing. This approach is worth considering if the failing + check or mitigation will self-correct itself (e.g. if the PEPC has moved + recently there will be a short cooldown before the PEPC integrity is + restored). +- The PEPC could be corrected by the user agent itself in order to preserve + its integrity. 
For example, if the style specified by the site sets the PEPC + font to be too small to read, this can be corrected by the user agent by + forcing a minimum font size on the PEPC. This should be considered primarily + in the case of CSS, which the user agent can override as it sees fit. **Open + question:** should there be a way for sites to specify whether they want to + allow the user agent to override the style? Some site authors might be + happier triggering the legacy prompt request flow, rather than have the PEPC + style be changed whereas others might prioritize the benefits of a PEPC + permission flow over making sure the style is exactly as desired. User + agents need to weigh in the additional flexibility afforded to site authors + against the potential user confusion of seeing the PEPC permission prompt vs + the regular permission prompt. ### Locking the PEPC style @@ -689,11 +704,11 @@ limit of at most one PEPC per permission type, per page. Subframe usage will be allowed but several security constraints need to be enforced: -- Permission Policy should be first checked to ensure that the permission is - allowed in the subframe. -- A valid `X-Frame-Options` header or a `frame-ancestors` CSP policy needs to be - set to prevent clickjacking attacks where a malicious site embeds a legitimate - site that uses a PEPC. +- Permission Policy should be first checked to ensure that the permission is + allowed in the subframe. +- A valid `X-Frame-Options` header or a `frame-ancestors` CSP policy needs to + be set to prevent clickjacking attacks where a malicious site embeds a + legitimate site that uses a PEPC. ### Custom cursors @@ -717,7 +732,7 @@ site needs to know. Information that can be already determined (for example via the Permissions API) is fine to be exposed via the PEPC. Other sensitive information should not be. 
-_Example:_ Many user agents provide a way for an admin to manage certain +*Example:* Many user agents provide a way for an admin to manage certain permissions on behalf of the user. In such cases the user agent might decide to have the PEPC text reflect this state, perhaps by setting the PEPC text to "Admin blocked". This would however provide information to the site that they @@ -736,38 +751,38 @@ trigger the permission prompt if it is indeed needed. The permission prompts are drawn starting from a fixed point above the web content area. ![](images/image20.png) \ -_Example notification permission prompt on Chrome_ +*Example notification permission prompt on Chrome* ![](images/image21.png) \ -_Example location permission prompt on Firefox_ +*Example location permission prompt on Firefox* In order to evaluate the user experience for these prompts **from the perspective of the user agent**, there are some questions that can be considered: -1. Does the user notice this prompt or is their attention engaged elsewhere? \ - The prompt was triggered by the site using a JavaScript API at some point that - might seem entirely arbitrary (from the user's perspective). There is no way to - tell whether this has any connection to what the user is currently doing, which - significantly increases the chance that the prompt will simply go unnoticed by - the user. -1. Does the user understand what the site feature that triggered this permission - request does? Can they weigh the potential benefit that the feature can - provide them against the potential downsides? The user might understand and - be aware of why a site might request their permission to access some powerful - feature, or they might have no context for this and there are no signals to - distinguish between these scenarios. -1. Does the user have any interest in the site feature that requires their - permission? 
It could certainly be the case that this prompt is in response to - the user showing interest in some feature (e.g. by pressing a button that - says "Use my location" on a food delivery service site), but it could also be - the case that the user is not interested in the feature at all. -1. If the user chooses to deny the permission request here, will they know how - to revisit this decision in the future should they change their mind? Many - user agents implement some form of temporary or permanent deny decision - policy to prevent sites from spamming permission prompt requests. However - this makes it difficult for sites to recover from this state even if the user - shows clear interest in the feature. +1. Does the user notice this prompt or is their attention engaged elsewhere? \ + The prompt was triggered by the site using a JavaScript API at some point + that might seem entirely arbitrary (from the user's perspective). There is + no way to tell whether this has any connection to what the user is currently + doing, which significantly increases the chance that the prompt will simply + go unnoticed by the user. +1. Does the user understand what the site feature that triggered this + permission request does? Can they weigh the potential benefit that the + feature can provide them against the potential downsides? The user might + understand and be aware of why a site might request their permission to + access some powerful feature, or they might have no context for this and + there are no signals to distinguish between these scenarios. +1. Does the user have any interest in the site feature that requires their + permission? It could certainly be the case that this prompt is in response + to the user showing interest in some feature (e.g. by pressing a button that + says "Use my location" on a food delivery service site), but it could also + be the case that the user is not interested in the feature at all. +1. 
If the user chooses to deny the permission request here, will they know how + to revisit this decision in the future should they change their mind? Many + user agents implement some form of temporary or permanent deny decision + policy to prevent sites from spamming permission prompt requests. However + this makes it difficult for sites to recover from this state even if the + user shows clear interest in the feature. ### User Agent abuse mitigations @@ -775,16 +790,16 @@ The shortcomings of the current status quo of permission prompts practically has the side-effect that user agents need to be quite defensive to shield users from unwanted permission prompts: -1. Many user agents implement a "permanent deny" policy, and other user agents - offer it as an option in the permission prompt. This means that a site will - not be able to ask for permission again after the user has blocked it. - Sometimes this is for some fixed (or increasing) duration, not strictly - speaking permanent. This helps prevent unwanted permission prompt spam though - it can sometimes lead to user confusion if they wish to change their mind - later as it requires them to discover the appropriate UI that allows them to - make the change manually. -1. Some user agents use heuristics, blocklists or ML-powered algorithms in an - effort to shield users from unwanted permission prompts. +1. Many user agents implement a "permanent deny" policy, and other user agents + offer it as an option in the permission prompt. This means that a site will + not be able to ask for permission again after the user has blocked it. + Sometimes this is for some fixed (or increasing) duration, not strictly + speaking permanent. This helps prevent unwanted permission prompt spam + though it can sometimes lead to user confusion if they wish to change their + mind later as it requires them to discover the appropriate UI that allows + them to make the change manually. +1. 
Some user agents use heuristics, blocklists or ML-powered algorithms in an + effort to shield users from unwanted permission prompts. Even with these measures in place, most user interactions on permission prompts are negative. For notifications (the most requested permission type), Google @@ -795,16 +810,16 @@ dismissed or blocked by the user add up to approx 92% on desktop platforms and A permission model designed to be initiated by the user would solve these issues. If the user initiates the permission request it ensures that: -1. The user understands the purpose of the permission, or at least has enough - **context** to feel comfortable engaging in an activity that uses this - permission. -1. The user's current flow or task is related to granting this permission and as - such it's unlikely that the permission request could be **interruptive**. -1. The user agent can ensure the subsequent UI is placed near the current - **focus** of attention of the user. This is because the user has just - interacted with some piece of UI to request the permission which means their - focus is likely in the area. Because of the above, it is unlikely that such a - placement is interruptive or annoying. +1. The user understands the purpose of the permission, or at least has enough + **context** to feel comfortable engaging in an activity that uses this + permission. +1. The user's current flow or task is related to granting this permission and + as such it's unlikely that the permission request could be **interruptive**. +1. The user agent can ensure the subsequent UI is placed near the current + **focus** of attention of the user. This is because the user has just + interacted with some piece of UI to request the permission which means their + focus is likely in the area. Because of the above, it is unlikely that such + a placement is interruptive or annoying. ## Alternatives considered @@ -841,30 +856,33 @@ This could be an example of how this would look like: Disadvantages: -1. 
`button` - 1. Backwards-compatibility and interoperability: old versions and user agents - that don't implement the permission element will still render and create a - `button` element that does not do anything. This is a worse experience if - not compensated with some other solution (e.g. a polyfill). - 1. Flexibility: this proposal generally imagines the HTML control as a - button, but future extensions of this element could instead use some - different type of UI like a checkbox, a link, a radio etc. - 1. Counter-intuitive: buttons usually have a lot more flexibility than the - PEPC has (e.g. the button text is set by the author). A site author using - the PEPC would have to be always aware of the differences between the PEPC - and a regular button. If the behavior between elements is significantly - different then it makes sense that they should be distinct elements. -1. `input` - 1. The `input` - [element represents a typed data field, usually with a form control to allow the user to edit the data](https://html.spec.whatwg.org/multipage/input.html#the-input-element). - Different input types are designed generally to be used as part of a form - that the user enters data into and submits. While some exceptions exist - (e.g. ``), they still represent controls that are - supposed to integrate within a form (a `submit` or `reset` button, a - hidden field etc.). Since there is no connection between forms and PEPC, - adding a new input - [type](https://html.spec.whatwg.org/multipage/input.html#attr-input-type) - would be a poor design fit. +1. `button` + 1. Backwards-compatibility and interoperability: old versions and user + agents that don't implement the permission element will still render and + create a `button` element that does not do anything. This is a worse + experience if not compensated with some other solution (e.g. a + polyfill). + 1. 
Flexibility: this proposal generally imagines the HTML control as a + button, but future extensions of this element could instead use some + different type of UI like a checkbox, a link, a radio etc. + 1. Counter-intuitive: buttons usually have a lot more flexibility than the + PEPC has (e.g. the button text is set by the author). A site author + using the PEPC would have to be always aware of the differences between + the PEPC and a regular button. If the behavior between elements is + significantly different then it makes sense that they should be distinct + elements. +1. `input` + 1. The `input` [element represents a typed data field, usually with a form + control to allow the user to edit the + data](https://html.spec.whatwg.org/multipage/input.html#the-input-element). + Different input types are designed generally to be used as part of a + form that the user enters data into and submits. While some exceptions + exist (e.g. ``), they still represent controls that + are supposed to integrate within a form (a `submit` or `reset` button, a + hidden field etc.). Since there is no connection between forms and PEPC, + adding a new input + [type](https://html.spec.whatwg.org/multipage/input.html#attr-input-type) + would be a poor design fit. ### No platform changes @@ -874,12 +892,13 @@ this pattern via articles, communications etc. Disadvantages: -1. There is no signal or guarantee indicating the user's intent. This means that - the user agent still needs to remain defensive about permission requests. -1. It requires user experience design and consideration from the site's side. - There are many ways to get this wrong and provide a suboptimal user - experience. Also, providing a solution with best-practices built in helps - resource-constrained development teams more. +1. There is no signal or guarantee indicating the user's intent. This means + that the user agent still needs to remain defensive about permission + requests. +1. 
It requires user experience design and consideration from the site's side. + There are many ways to get this wrong and provide a suboptimal user + experience. Also, providing a solution with best-practices built in helps + resource-constrained development teams more. ### Providing a registration JS API @@ -902,21 +921,21 @@ page. Disadvantages: -1. This does not solve the problem of permissions not really being brought into - focus in the interaction design process. -1. The possibility of dynamically selecting which element is the PEPC - complicates the verification and constraints we recommend as part of - security. It is more robust for the same element to either always be a PEPC - or not. -1. Backwards-compatibility and interoperability: developers need to always be - careful to manually remove their HTML button that they planned to declare as - a PECP if the user agent does not implement the PEPC API, otherwise their - site will simply contain a button that does nothing. -1. Counter-intuitive: buttons usually have a lot more flexibility than the PEPC - has (e.g. the button text is set by the author). A site author using the PEPC - would have to be always aware of the differences between the PEPC and a - regular button. If the behavior between elements is significantly different - then it makes sense that they should be distinct elements. +1. This does not solve the problem of permissions not really being brought into + focus in the interaction design process. +1. The possibility of dynamically selecting which element is the PEPC + complicates the verification and constraints we recommend as part of + security. It is more robust for the same element to either always be a PEPC + or not. +1. 
Backwards-compatibility and interoperability: developers need to always be + careful to manually remove their HTML button that they planned to declare as + a PEPC if the user agent does not implement the PEPC API, otherwise their + site will simply contain a button that does nothing. +1. Counter-intuitive: buttons usually have a lot more flexibility than the PEPC + has (e.g. the button text is set by the author). A site author using the + PEPC would have to be always aware of the differences between the PEPC and a + regular button. If the behavior between elements is significantly different + then it makes sense that they should be distinct elements. ### Extending the Permissions API to provide an anchor point @@ -940,13 +959,14 @@ anchor. Disadvantages: -1. There is no signal of the user's intent and therefore user agents can not - make any of the improvements listed in the sections above, except for - positioning the prompt. However the user agent will still need to remain - defensive and make sure the user is protected against permission prompt spam. -1. This opens the permission prompt more to abuse as it allows malicious sites - to position it without having implemented any of the restriction or security - mechanisms that a PEPC would have. +1. There is no signal of the user's intent and therefore user agents cannot + make any of the improvements listed in the sections above, except for + positioning the prompt. However the user agent will still need to remain + defensive and make sure the user is protected against permission prompt + spam. +1. This opens the permission prompt more to abuse as it allows malicious sites + to position it without having implemented any of the restriction or security + mechanisms that a PEPC would have. ### Allowing recovery via the regular permission flow @@ -955,25 +975,56 @@ allow users to recover from situations where the permission is blocked.
However this needs to be balanced with protecting users from spam from bad actors on the web. There are some potential approaches to consider: -1. Some reputation-based mechanism that allows certain origins to recover from - a blocked permission state. This raises difficult ethical and technical - questions depending on which entity decides the how origin reputation is - calculated, and how a fair algorithm could be designed. - The ethical risk is that limiting access to powerful APIs based on origin - reputation is a dangerous feature that can potentially allow bad actors to - attempt to game the reputation algorithm (in their favor, or for a - competitor in their disfavor), and even the user agent itself could use this - algorithm to unfairly favor certain proprietary origins. - The technical difficulty consists of designing an algorithm that is fair - and precise. It needs to have a precision comparable to the precision of the - `` element signal of user intent. -1. A heuristic could be used to allow recovering from a blocked permissions - state based on various aspects of the user interaction on the site, previous - user action history, time since permission has been blocked, etc. However it - is very unlikely that the precision of such a heuristic would get even close - to the direct signal raised by the user's interaction with the `` - element. The usefulness of an unpredictable heuristic that "sometimes" allows - recovery makes for a bad developer and user experience. +1. Some reputation-based mechanism that allows certain origins to recover from + a blocked permission state. This raises difficult ethical and technical + questions depending on which entity decides how origin reputation is + calculated, and how a fair algorithm could be designed.
The ethical risk is + that limiting access to powerful APIs based on origin reputation is a + dangerous feature that can potentially allow bad actors to attempt to game + the reputation algorithm (in their favor, or for a competitor in their + disfavor), and even the user agent itself could use this algorithm to + unfairly favor certain proprietary origins. The technical difficulty + consists of designing an algorithm that is fair and precise. It needs to + have a precision comparable to the precision of the `` element + signal of user intent. +1. A heuristic could be used to allow recovering from a blocked permissions + state based on various aspects of the user interaction on the site, previous + user action history, time since permission has been blocked, etc. However it + is very unlikely that the precision of such a heuristic would get even close + to the direct signal raised by the user's interaction with the + `` element. The usefulness of an unpredictable heuristic that + "sometimes" allows recovery makes for a bad developer and user experience. + +### Implementing an origin based permission allow list registry + +An allow list registry could be created allowing well behaved origins to request +a review and once authorized the behavior of the Permission API could be +modified to allow the user to change previous permission decisions. + +Advantages: + +1. No change to HTML standards required. The allow list simply changes the + behavior of the permission API on certain origins. + +Disadvantages: + +1. Low effectiveness at scale and bias towards larger, better known origins. + The vast number of origins on the internet ensures that most origins could + not be reviewed. Many long tail sites offering genuine user value and + applying best practices, which might nominally qualify, would be excluded or + face long waiting periods despite implementing best practices. +1. Faulty reviews. 
This system would depend not only on an unbiased review + system (a tremendously difficult problem), but also on the ability of the + reviewer to detect cloaking behaviors that could lead to an incorrect allow + list approval. Sites could also change in design at any point, such as new + site ownership, and there is no practical way to signal to the allow list + registry that a fresh review was needed. +1. Cost. A system of allow listing origins would be a significant ongoing + operational expense, including a review and appeals process. Many user + agents would be excluded from being able to implement such a system. +1. Consistency. Different user agents would likely have their own allow list + mechanism resulting in inconsistent best practice guidance to developers and + headaches navigating the constraints of the allow list review process. ## Extending the PEPC in the future From 8bec92bc08cd3dfdbd43a37d8373413d28397261 Mon Sep 17 00:00:00 2001 From: andypaicu Date: Wed, 24 Jan 2024 12:16:49 +0100 Subject: [PATCH 2/3] Update explainer.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marcos Cáceres --- explainer.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/explainer.md b/explainer.md index 55e86bf..6beaa97 100644 --- a/explainer.md +++ b/explainer.md @@ -24,9 +24,10 @@ There are four main challenges with the status quo: and abuse mitigation approach has an architectural upper bound on user protection because the model relies on the website to choose when to trigger the permission request prompt rather than capturing a reliable signal of - user intent. Requiring a user gesture for the Permission API (or similar) + user intent. 
Requiring a user gesture to [request permission to use a powerful feature](https://www.w3.org/TR/permissions/#dfn-request-permission-to-use) (or similar) does not solve this problem as there are many ways of tricking a user into - providing such a gesture. + providing a so-called "[activation triggering input event](https://html.spec.whatwg.org/#activation-triggering-input-event)" + (i.e., a user gesture, such as clicking the mouse or pressing a key). 1. **Context**: Ideally, a site's developer will request access as part of a contextual flow that helps users understand what's being asked for and why, From 5cca9ca1812507b074cbfaf9ab9b2c6311719297 Mon Sep 17 00:00:00 2001 From: andypaicu Date: Wed, 24 Jan 2024 12:16:56 +0100 Subject: [PATCH 3/3] Update explainer.md MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marcos Cáceres --- explainer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/explainer.md b/explainer.md index 6beaa97..277c2b3 100644 --- a/explainer.md +++ b/explainer.md @@ -20,7 +20,7 @@ However these measures have limited effect There are four main challenges with the status quo: -1. **Insufficiency of existing mitigations** The present day permissions spam +1. **Insufficiency of existing mitigations**: The present day permissions spam and abuse mitigation approach has an architectural upper bound on user protection because the model relies on the website to choose when to trigger the permission request prompt rather than capturing a reliable signal of