Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce "Automation" section #151

Merged
merged 1 commit into from
Dec 22, 2017
Merged

Conversation

jugglinmike
Copy link
Contributor

@jugglinmike jugglinmike commented Jun 20, 2017

(Rendered version available at https://bocoup.github.io/permissions/index.html#automation) Thanks to gh-152, previews are now created automatically

The goal of this patch is to support user-agent automation and application testing for integration with the Permissions API. It's based on a recent mailing list discussion and subsequent "brainstorming"
document
. There, we decided that it would be necessary to simulate user response to "pending" requests from the "request permission to use" and "prompt the user to choose" algorithms. However, the specification does not currently track these requests outside of the algorithms themselves. Automation introduces the need for two operations:

  1. Retrieving pending requests
  2. Request differentiation

Both of these could certainly be implemented in formal specification language, though this would involve modification of sections other than "Automation". In some ways, we are breaking new ground here, and I am reluctant to complicate this specification (or any other) with normative text that does not directly further the use case. So for this initial draft, I have erred on the side of minimalism. More specifically:

  1. I have implemented request retrieval with the text: "each 'request
    permission to use' operation that is awaiting user input" and "each 'prompt
    the user to choose' operation that is awaiting user input". Is that
    sufficient?
  2. I'm assuming that one the requests have been retrieved, they may be uniquely
    identified by the permission descriptor that was used to create them. My
    understanding is that although the UA may choose to present requests to the
    user in a number of different ways (i.e. consolidating parallel requests for
    distinct Bluetooth devices), each request must still be associated with a
    single permission descriptor as specified through the JavaScript API. When
    it comes to simulating user denial, this approach conflates "use" requests
    and "choose" requests under the assumption that a given permission
    descriptor will never describe both kinds of requests simultaneously. Is
    that accurate?

Separately, it looks as though there is no restriction on the value type for options available in "prompt the user to choose." Is this intentional? For now, the proposed automation language is similarly type-neutral.

...but these are just the potential problems that I've identified. I welcome feedback of any kind, of course!

cc @shs96c @foolip


Preview | Diff

@foolip
Copy link
Member

foolip commented Jun 20, 2017

When it comes to simulating user denial, this approach conflates "use" requests and "choose" requests under the assumption that a given permission descriptor will never describe both kinds of requests simultaneously. Is that accurate?

@guidou, as I understand it the getUserMedia prompt is a little bit complicated in Chrome, but the details elude me. Is it simultaneously a "use" and "choose" request, or how would you put it?

@foolip
Copy link
Member

foolip commented Jun 20, 2017

Beware that I may be confused, https://w3c.github.io/permissions/#prompt-the-user-to-choose isn't referenced by https://w3c.github.io/mediacapture-main/ and it looks like it might not be intended for choosing between devices but something else.

Copy link
Member

@foolip foolip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a bit of surface-level reviewing before realizing I'm not the best person to review the details of this. Can you try to find people working on WebDriver implementations for each vendor and ask for their review? At least one would be good.

index.bs Outdated
commands</a> for the [[WebDriver]] specification.
</p>

The <dfn>no such request</dfn> <a>error</a> has a [=response/status=] of 404
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is adding a line to the table in https://w3c.github.io/webdriver/webdriver-spec.html#handling-errors, but the link to https://fetch.spec.whatwg.org/#concept-response-status and no link to and no use of "JSON Error Code" (can't be linked?) made that less than clear. A table in the same style with a single row would work, but I'll leave that to your editorial discretion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WebDriver's extension mechanism does not currently provide an explicit means of specifying new "errors", so I am making this up as we go. I'll follow up with Simon in this pull request's main comment thread.

index.bs Outdated
<p>
For the purposes of user-agent automation and application testing, this
document defines the following <a>errors</a>, algorithms, and <a>extension
commands</a> for the [[WebDriver]] specification.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rendered as "[WebDriver]", I think missing a "!" to make it a normative reference and presumably fix the problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is expected. See, for example, the current rendering for the text [[clipboard-apis]].

<tbody>
<tr>
<th>HTTP Method</th>
<th><a lt="extension command prefix">Prefix</a></th>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What will the resulting URLs look like? https://w3c.github.io/webdriver/webdriver-spec.html#protocol-extensions talks about "vendor-specific needs" and "Commands that are specific to a user agent are called extension commands", which isn't the case here.

Should we perhaps just be defining new commands in the style of other commands in the WebDriver spec with a "Path Template"? If the resulting URLs are the same it doesn't matter, but I suspect not, and it'd be a bit silly if you could tell from the URL where something was specified.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the text is definitely slanted towards vendor-specific extensions, but the use case for web standards is more clear in the "Extensions" sub-section of "Design Notes":

WebDriver aims to allow other standards to provide extensions to support new
functionality, and to allow conformance tests that cannot be implemented
entirely in ECMAScript to be written in a vendor-neutral way.

I was also a little confused about URL/URI specification while writing this. Earlier today, I opened a couple pull requests that I hope will clear things up a bit:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still ironing out the details on this one--see w3c/webdriver#961 for the latest

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the above issue, I think what is here is good. The extension command prefix is "permissions" and the extension command name is "set". So the entire command will look like session/{:sessionId}/permissions/set which is in line with the example.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any point having /get and /set? It seems more in line with the rest of WebDriver to use GET and POST verbs towards the same URL template.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late response, I agree with @burg. It would be more simple to have the command be POST session/{:sessionId}/permissions
Also I think it might make sense to have a get permissions command here, but that can come later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ve filed w3c/webdriver#1142 to fix the slant towards vendor-specific extensions. It is the intention to make extension commands a useful entry-point for other specifications to define commands that could be useful to enable testing.

The current prose is slanted towards vendor-specific extensions because that was the first use case we had for them, and this is the first specification that tries to hook into WebDriver. If the current method hooking into WebDriver are sub-optimal we can fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any point having /get and /set? It seems more in line with the rest
of WebDriver to use GET and POST verbs towards the same URL template.

This proposal does not include a /get endpoint, but I think your point about idiom remains.

The WebDriver specification requires that all extension commands be specified with a "prefix" and a "name", which are combined to form the extension command's URI template. When I authored this patch, I could only satisfy that requirement by specifying some string, so I chose "Set." In light of recent recommendations about the "session ID", I think we could circumvent this by specifying the "prefix" as "session/{session id}" and the "name" as "permissions". However, this doesn't really jive with the intent of the specification:

separating extension commands from other commands in order to avoid potential
resource conflicts with other implementations.

So I think we'll need to revise the WebDriver spec if we want a nicer URL. I've proposed that here w3c/webdriver#1170

@jugglinmike
Copy link
Contributor Author

@shs96c this patch defines a custom WebDriver "error", but (as Philip pointed out) it's a little awkward. Do you think the Protocol Extensions section should include some text to help specifications accomplish this?

@raymeskhoury
Copy link
Collaborator

In some ways, we are breaking new ground here, and I am reluctant to complicate this specification (or any other) with normative text that does not directly further the use case. So for this initial draft, I have erred on the side of minimalism.

Thanks, I like the approach of avoiding complicating the normative text for now because it already has several problems and they could be exacerbated if we do until they are resolved :)

Possibly naively I feel that the "proactive API" mentioned in the doc is the simplest thing to do. This is what we use in our LayoutTests in chrome and it works well. The doc mentions the downside that this is "Somewhat artificial in that it circumvents the prompting mechanism altogether.". Did you have in mind practical impacts of this. Does it just "feel bad" or does it mean that we can't test something that we otherwise would be able to test?

Beware that I may be confused, https://w3c.github.io/permissions/#prompt-the-user-to-choose isn't referenced by https://w3c.github.io/mediacapture-main/ and it looks like it might not be intended for choosing between devices but something else.

Media capture doesn't use a chooser paradigm, they just have separate permissions to gate access to camera, mic and speakers.

Thanks!

@jugglinmike
Copy link
Contributor Author

Can you try to find people working on WebDriver implementations for each
vendor and ask for their review? At least one would be good.

Of course! @thejohnjansen @andreastt would you mind taking a look?

@jugglinmike
Copy link
Contributor Author

@raymeskhoury thanks for the feedback!

Did you have in mind practical impacts of this. Does it just "feel bad" or
does it mean that we can't test something that we otherwise would be able to
test?

As I understand it, the "proactive API" would be specified to only modify an
environment settings object. In other words, it would have no effect on any
permission requests that were already awaiting user input.

Such an API seems to be just as able to reach all the normative text, so it may
be equally valid for the purposes of conformance testing.

The API does has some bearing on the way tests are authored, though. It makes
the "request permission" state non-recoverable. Application developers would
have to maintain tests dedicated solely to verifying that certain user actions
triggered prompting, but that went no further in an interaction flow. The
degree to which this matters will vary between developers and projects, but I
expect those that ascribe the Behavior-Driven Development will find it annoying
to implement tests described by, "When I click the 'Join' button, the
application displays the text 'Waiting' and nothing else happens."

I decided to try the so-called "reactive" API for this initial patch in light
of this and in deference to existing patterns in WebDriver (Simon shared the
following in that document you've read--I'm including it here just to make the
conversation easier to follow for others)

This most closely models the current webdriver commands. The example, the
User Prompt commands follow this style, and (obviously) so do the commands
for interacting with elements.

That said, I'm not an implementer. If we can agree on an alternative that feels
right for the various "drivers" and the spec editors, I'm happy to re-write any
or all of this patch.

@raymeskhoury
Copy link
Collaborator

in deference to existing patterns in WebDriver

Ah thanks for explaining. I didn't notice that comment in the doc. Thinking through things a bit more, I do feel that the "reactive" approach isn't entirely sufficient. It would be ok if UAs were mandated to show prompts at particular times, but they aren't always. So, for example, whether or not a prompt will be shown when navigator.requestMIDIAccess is called for the first time is not well defined (intentionally left up to the UA). The test writer would have no way to control this either, because they only have the ability to react to prompts if they are shown. So it feels like we would need a way for test writers to define this initial state in the first place anyway. In spec terms, basically setting a well-defined value for what the current "permission state of a PermissionDescriptor" will return.

I certainly have no expertise with WebDriver though and am happy for those patterns to inform the decision :) But these are just some thoughts.

@jugglinmike
Copy link
Contributor Author

The test writer would have no way to control this either, because they only
have the ability to react to prompts if they are shown. So it feels like we
would need a way for test writers to define this initial state in the first
place anyway.

Technically, they could write tests that accounted for this uncertainty, e.g.:

driver.click('.enable')
  .then(() => driver.grant({ name: 'midi', sysex: true }).catch(() => {}))
  .then(/* etc. */)
  .then(/* etc. */)
  .then(/* etc. */);

...and even though the pattern could be encapsulated in WebDriver language bindings themselves (e.g. a grantIfPrompted method), none of this is ideal.

I certainly have no expertise with WebDriver though and am happy for those
patterns to inform the decision

I appreciate that, but this detail about prompts being optional seems unprecedented, potentially limiting the relevance of those existing APIs. I'd love to hear from @shs96c again to see if he was aware of this and if it effects his recommendation.

(...and while I know this isn't the place to discuss it, I'm wondering why the specification is so permissive in this regard. It seems like the unpredictability we're discussing is more than just a challenge for automated tests--it could impact usability, too. Do you know if there is anything published online about the design decision?)

@raymeskhoury
Copy link
Collaborator

..and while I know this isn't the place to discuss it, I'm wondering why the specification is so permissive in this regard. It seems like the unpredictability we're discussing is more than just a challenge for automated tests--it could impact usability, too. Do you know if there is anything published online about the design decision?

I can try to dig up some discussion around it but it's definitely intentional. Permission UI is very hard and we need to leave lots of freedom for UAs to experiment right now. Prompts are far from an ideal solution and if UAs can find good ways to make permission decisions in other ways it will result in a better user experience.

@jugglinmike
Copy link
Contributor Author

Prompts are far from an ideal solution and if UAs can find good ways to make
permission decisions in other ways it will result in a better user
experience.

Oh! This is very useful context for me; I had assumed that prompts were
fundamental to the Permissions API. Given that this is not the case, then it
seems unwise to define a testing API which is geared around prompts, or really,
any user interaction at all. This makes the "proactive API" seem much more
fitting.

At first, I thought this style of API tended to subvert the "driver is a real
user" conceit. In other words: if users can't arbitrarily set permission status
at any time, then WebDriver shouldn't expose that functionality (since it would
encourage people to write tests that don't reflect real user flows). But that's
a faulty assumption. I can set my permissions for this domain via the Firefox
UI right now.

So consider me convinced. The "proactive API" is the most fitting solution.
I'll work on updating this patch accordingly (expecting it to become much
simpler).

@jugglinmike
Copy link
Contributor Author

Okay, I've re-written this patch to manipulate permission state directly. As
expected, this is much simpler! The functionality is exposed via a single
command, "Set Permission" whose URI template looks like this:

/session/{session id}/permissions/set

Given that this is the only command, the set part is kind of superfluous. Due
to the way the WebDriver extension mechanism is worded, though, we have to
specify some value there.

Alternatively, we could expose this as three commands:

/session/{session id}/permissions/grant
/session/{session id}/permissions/prompt
/session/{session id}/permissions/deny

...but this may be confusing, since I think most people would assume that
POSTing to prompt would cause a prompt to be displayed. And further, these
URLs sound like they would work as a means to interact with active prompts, but
(as discussed above), they would not. (Although maybe that can change--I've
opened gh-153 to discuss a modification that would make this "proactive" API
also work "reactively".)

@jugglinmike
Copy link
Contributor Author

I just pushed up a correction for a typo in the new section's example.

Copy link
Collaborator

@raymeskhoury raymeskhoury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks jugglinmike!

index.bs Outdated
<li>If |settings| is a <a>non-secure context</a> and
<code>|rootDesc|.{{PermissionDescriptor/name}}</code> isn't <a>allowed in
non-secure contexts</a>, return <a>error</a> with <a>error code</a>
<a>invalid argument</a>.</li>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require that all the checks that are added in the permission state algorithm need to be duplicated here? Would it be better instead to try to hook in at step 3 or 4? In any case I think this is ok as a first cut.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, this is the only condition from the permission state algorithm that needs to be duplicated. When you write, "all the checks that are added in the permission state algorithm," I'm not sure what other checks you have in mind. I'm wondering if it's somehow possible for powerful features to define addition checks here. Or if maybe you are considering future modifications to the specification that introduce new checks. Would you mind elaborating a bit?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - I was just thinking of checks that might be added in the future. It's not a huge deal, just something to keep in mind.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a WebDriver spec point of view, it is confusing that "is a secure context" is something to decide about a settings object rather than a browsing context.

index.bs Outdated
<li>Interpret the value of |state| as <a>new information about the user's
intent</a>. If |allRealms| is true, this information applies to other
<a>realms</a> with the <a>same origin</a>.
<li>Return <a>success</a> with data `null`.</li>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that at least in Chrome it actually may be hard to return per-realm values at present. I'm also actually not sure that it's necessary when testing most things. I wonder if we should just default to all realms and add the behavior later if it turns out to be necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was shooting for completeness here; ideally, the WebDriver command can be used to simulate any user behavior. That said, it may not be worth the effort, since it seems exceedingly unlikely that an application would concern itself with this case (e.g. multiple simultaneous tabs, communicating via postMessage, and able to detect when/if only one received new information about the user's intent).

On the third hand, it might be safest to initially specify a solution that is complete as possible and leave subjective determination of "practically complete" to implementers. That way, if developer experience reveals that this is relevant, then the problem can be resolved without revisiting the standards process.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think completeness is a good thing too. You're right, I guess it's probably going to be ok for most tests if we don't fully implement the API and default to setting the permission for all realms even when that value is not true. The other option to consider would be making allRealms the default for the API which would encourage test writers to only specify to use an individual realm when it makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other option to consider would be making allRealms the default for the
API which would encourage test writers to only specify to use an individual
realm when it makes sense.

In terms of default behavior, I don't really have strong feelings one way or
the other. In terms of API design, though, I find optional arguments that
default to true a little counter-intuitive (since that conflicts with
JavaScript semantics regarding the "truthiness" of undefined). If we move
forward with the inverse behavior, I'd suggest naming the option something like
oneRealm so that the behavior for the "option not specified" case would map
to the false value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Chrome has trouble implementing a 1-realm permission grant
  • Firefox defaults to ephemeral grants, meaning the permission state stays "prompt" even if the user grants permission. If you override that, it becomes an all-realm grant. I don't know if they can do a non-ephemeral 1-realm grant.
  • Safari defaults to 1-realm grants, and if you override that it becomes an all-realm grant.

In order for the automation API to work, it has to select Firefox's persistent mode, so I think I'd vote for making all-realm the default. @jan-ivar, got any counter-proposals?

I agree that we shouldn't have optional arguments that default to true.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to read more to understand the realms concept. But based on what I have read here, I don't think it is necessary for testing most common user flows related to permissions. When a WebDriver test triggers getUserMedia() by executing script, the test knows what browsing context is active, and can change permissions values as appropriate if different behavior is desired per browsing context. A UA like Safari may have per-site permissions or global permissions, but this seems irrelevant since Automation sessions do not inherit any of the persistent UA state.

@jugglinmike
Copy link
Contributor Author

Hey @raymeskhoury,

It's been a little while since I received any feedback on this patch, so I'm wondering if you have any guidance about what I can do to help move it forward.

Thanks!

@raymeskhoury
Copy link
Collaborator

Sorry I was out of office for a while. The overall idea seems good to me but I'm not familiar with WebDriver and I think we would want @jyasskin to sign-off too.

One thing I'm wondering more generally about WebDriver (apologies if this is a naive question) but what would it look like to set the permission value from inside the test in JavaScript with the proposed API?

@foolip
Copy link
Member

foolip commented Aug 31, 2017

@raymeskhoury, @gsnedders is working on the infrastructure to wrap WebDriver APIs like this in web-platform-tests/wpt#6897. An example in that PR is test_automation.click(button), so this WebDriver API might be wrapped as test_automation.set_permission(name, value) perhaps.

@marcoscaceres
Copy link
Member

I'm still unclear how this will work cross-browser and version? What voodoo will you be using?

@marcoscaceres
Copy link
Member

/me spots driver.grant() in comments above.

@foolip
Copy link
Member

foolip commented Sep 14, 2017

@gsnedders, can you paste a snippet of how a test might end up looking and fill in the details on @marcoscaceres' "What voodoo will you be using?"

@guidou
Copy link

guidou commented Sep 14, 2017

@guidou, as I understand it the getUserMedia prompt is a little bit complicated in Chrome, but the details elude me. Is it simultaneously a "use" and "choose" request, or how would you put it?

Sorry for the late reply. The getUserMedia prompt in Chrome requests permission to use device types (cameras and/or microphones, depending on the request). It does not allow choosing a device.
There are no per-device permissions in Chrome for cameras and microphones.

index.bs Outdated
<li>If |allRealms| is neither true nor false, return <a>error</a> with
<a>error code</a> <a>invalid argument</a>.</li>
<li>Interpret the value of |state| as <a>new information about the user's
intent</a>. If |allRealms| is true, this information applies to other

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this is the line that actually intends one to change the permissions, specifically it says to add "new information about the user's intent". If I was implementing this as someone who knows WebDriver, but not permissions, I would have to dig through the permissions spec and figure out what this new information actually means and how I can update it. Perhaps be more specific and reference other parts of the API in this part?

@jugglinmike
Copy link
Contributor Author

Thanks for the review, @JKereliuk! The BikeShed tool renders that text as a hyperlink to the definition of the phrase in the output document. The definition reads:

New information about the user’s intent

The UA may collect information about a user’s intentions in any way its
authors believe is appropriate. This information can come from explicit user
action, aggregate behavior of both the relevant user and other users, or
other sources this specification hasn't anticipated.

I think this answers the question of "what this new information actually means," but since the build-time link generation is subtle, I'm not sure if (1) you were not aware of it, or (2) you were aware of it, but feel the referenced definition is inadequate. Could you let me know?

In either case, the definition definitely doesn't answer the question, "how can I update it?" As far as I know, though, this vagueness is intentional. I believe the specification editors chose loose language to give implementers more freedom.

@kereliuk
Copy link

For "what this new information actually means,", it was the second one, I am not immediately sure what action needs to be done from the definition; however, if that is the intention that resolves my worry with it!

@foolip
Copy link
Member

foolip commented Sep 22, 2017

@jyasskin, I think this is in pretty good shape now, can you give it a round of review and say what more is needed to merge this?

@gsnedders, can you paste an example of how this would be wrapped in wpt?

Copy link
Member

@jyasskin jyasskin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to leave out prompt the user to choose for now. To handle that, we'll actually need to define the options that each permission offers in its prompts (e.g. bluetooth or USB devices, and in the future particular cameras, microphones, or speakers), and have the automated test pick out which one to accept.

index.bs Outdated
</tr>
<tr>
<td>POST</td>
<td>permissions</td>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://w3c.github.io/webdriver/webdriver-spec.html#dfn-extension-command-prefix says this should be a URI template, and gives the example of /session/{session id}/ms/edge.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am not understanding correctly, but in the specification is it not sufficient to just have the "permissions" as the URI.
That way in WebDriver implementations they can be /session/{session id}/vendorspecificuri/permissions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JKereliuk It's actually the consumer's responsibility to define that part of the URL. That wasn't clear when I wrote this patch back in June, and it's only since then that we've made the interaction explicit (see w3c/webdriver#956 and w3c/webdriver#962). So Jeffrey's request here makes sense to me; the current text is a relic of a issue we've since resolved in WebDriver.

index.bs Outdated
<ol>
<li>Let |permissionDesc| be the result of <a>getting the property</a>
"`descriptor`" from the |parameters| argument.</li>
<li>If |permissionDesc| is not a JSON object, return <a>error</a> with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uncomfortable with WebDriver reserving the term "error" globally. Can you call this "WebDriver error" (by changing the text: error; url: dfn-error line at the top of the file)?

I'd also rather include the article between "return" and the thing: "return a WebDriver error" reads better than "return WebDriver error". This is inconsistently followed in https://w3c.github.io/webdriver/webdriver-spec.html.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jyasskin Would renaming WebDriver’s error definition to, say, WebDriver error or something better make WebDriver a better citizen for other specifications to hook into?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jyasskin I'm not sure I understand the concern here, since this is part of the WebDriver spec that has a specific error code; however, maybe its bad convention that WebDriver does this in its spec and that should be changed?

Also it looks like the link for error is broken, I think it should be this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure--for now, I''ll plan to implement this change and reference WebDriver's "error" term as "WebDriver error". I've also opened w3c/webdriver#1168 in the interest of making the WebDriver specification more ergonomic.

index.bs Outdated
<code>|rootDesc|.{{PermissionDescriptor/name}}</code>'s <a>permission
descriptor type</a>. If this throws an exception, return <a>error</a>
with <a>error code</a> <a>invalid argument</a>.</li>
<li>Let |settings| be the <a>current settings object</a>.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you need to pull the settings object out of the current session's current browsing context.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, you're right!

index.bs Outdated
<p>The <a>remote end steps</a> are:</p>
<ol>
<li>Let |permissionDesc| be the result of <a>getting the property</a>
"`descriptor`" from the |parameters| argument.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent continuation lines inside <li> elements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing

index.bs Outdated
<code>|rootDesc|.{{PermissionDescriptor/name}}</code> isn't <a>allowed in
non-secure contexts</a>, return <a>error</a> with <a>error code</a>
<a>invalid argument</a>.</li>
<li>Let |state| by the result of <a>getting the property</a> "`state`"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/by/be/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

index.bs Outdated
<a>error</a> with <a>error code</a> <a>invalid argument</a>.</li>
<li>Let |allRealms| be the result of <a>getting the property</a>
"`allRealms`" from the |parameters| argument.</a></li>
<li>If |allRealms| is `undefined`, let |allRealms| be false.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be simpler to write a WebIDL dictionary that parameters needs to be an instance of, and convert it all at once. That'd make it easy to default allRealms, typecheck state, etc.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reasons already described above, the "proactive" API approach means that UA does not need to show any [modal] prompts during automation. I highly prefer this to the "reactive" approach. From an implementor perspective, UA prompts are to be avoided if at all possible when running under automation.

So, given that there is no reason to display a UA prompt and no way to accept/deny the prompt with the proposed proactive command set, I don't understand what happens when the state is "prompt" and a request is made. I think "granted" and "denied" is sufficient.

index.bs Outdated
<li>Interpret the value of |state| as <a>new information about the user's
intent</a>. If |allRealms| is true, this information applies to other
<a>realms</a> with the <a>same origin</a>.
<li>Return <a>success</a> with data `null`.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Chrome has trouble implementing a 1-realm permission grant
  • Firefox defaults to ephemeral grants, meaning the permission state stays "prompt" even if the user grants permission. If you override that, it becomes an all-realm grant. I don't know if they can do a non-ephemeral 1-realm grant.
  • Safari defaults to 1-realm grants, and if you override that it becomes an all-realm grant.

In order for the automation API to work, it has to select Firefox's persistent mode, so I think I'd vote for making all-realm the default. @jan-ivar, got any counter-proposals?

I agree that we shouldn't have optional arguments that default to true.

index.bs Outdated
<li>If |allRealms| is `undefined`, let |allRealms| be false.</li>
<li>If |allRealms| is neither true nor false, return <a>error</a> with
<a>error code</a> <a>invalid argument</a>.</li>
<li>Interpret the value of |state| as <a>new information about the user's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We indirect a lot of the permission API stuff through "new information about the user's intent" in order to give UAs lots of freedom about how to infer the right permission state. However, in an automation API, we don't want to give that freedom, so I think you should just say

The UA must act as if it returned state from an invocation of typedDescriptor's permission state for settings at this point.

They may be better ways to word that, but I'm trying to hook into the "If there was a previous invocation of this algorithm ..." wording.

If allRealms is true, you'll post a task to all other realms with the same origin to do the same thing.

We'll have some trouble implementing this for the always-granted APIs like "midi", so you may want to let the UA explicitly reject the update.

@jyasskin
Copy link
Member

Also, could you fix the IPR issue described in https://labs.w3.org/hatchery/repo-manager/pr/id/w3c/permissions/151? Thanks.

@foolip
Copy link
Member

foolip commented Nov 12, 2017

@JKereliuk, can you summarize here the outcome of the discussion(s) at TPAC?

@kereliuk
Copy link

At TPAC we discussed that Permissions is probably not well suited in the native WebDriver spec at this point, but we want to work with the permissions WG to try and land something in this spec, (i.e. this PR).

@foolip
Copy link
Member

foolip commented Nov 13, 2017

@JKereliuk that sounds great! Did you also discuss any differences between what we have here and in SafariDriver?

@kereliuk
Copy link

This is the top level description from @burg

@marcoscaceres
Copy link
Member

@AutomatedTester, you are probably in the best position to make a determination if we (Mozilla) can support what is being proposed here in Gecko driver (I've never even used Selenium - even had to Google how to spell it!).

Could you possibly take a look at what's being proposed and see if it's sane and something we could support in Gecko?

@kereliuk
Copy link

@AutomatedTester ^^^ friendly ping :)

@jugglinmike
Copy link
Contributor Author

Hi @shs96c, @AutomatedTester, and @burg!

I'm looking to move forward with this patch. To that end, I've been trying to catch up on the recent work done in WebKit and the
resultant conversation at TPAC
. I think that discussion has changed the group's preferred automation "style", but I'm having trouble understanding the resolution.

To recap, we've been considering three alternatives:

  • "passive," where WebDriver implements a "permissions handler" capability which declaratively states how it will respond to requests for permissions in a given WebDriver session
  • "reactive," where WebDriver commands influence pending requests for permission; commands are intended to be issued after the web application has requested permission
  • "proactive," where WebDriver commands set permission descriptors; commands are intended to be issued before the web application requests a permission

Reading the minutes, there seems to be some conflation of "reactive" approach with "UI interaction." Although I first thought in these terms, I no longer believe that this is strictly necessary. As I interpret it, the "reactive" approach could be implemented in the absence of any particular UI.

Separately, @jgraham (speaking today via IRC) has suggested that there are concerns that a "reactive" style would not map well in synchronous communication contexts. Unfortunately, James wasn't present for the whole discussion, and I can't find any notes on this.

Given that I initially proposed a "reactive" style and switched to the "proactive" style following feedback from @raymeskhoury, I'd like to get some clarity from those involved before making any specific changes. Could one of you folks could elaborate on the resolution at TPAC?

@jugglinmike
Copy link
Contributor Author

Thanks for all the feedback, @jyasskin. I've been a bit slow to implement it
this week because I've been concerned that this patch contains some fundamental
flaw from an implementer's or editor's perspective. After revisiting the TPAC
meeting minutes, reviewing @burg's design document, and checking in with
@AutomatedTester, I don't believe this is the case. I've incorporated most of
your advice, but there are a few things I could use clarification on:

They may be better ways to word that, but I'm trying to hook into the "If
there was a previous invocation of this algorithm ..." wording.

I'm trying this for now: "Interpret parameters.state as if it were the result
of an invocation of permission state for typedDescriptor with the argument
target made at this moment." But I think that we should probably formalize this
state.

If allRealms is true, you'll post a task to all other realms with the same
origin to do the same thing.

As I understand it, WebDriver algorithms execute outside of any particular
event loop. If that's correct, then I think task queuing is unnecessary. Also
note that in accordance with your other suggestion, this is now worded in terms
of environment settings objects (not realms). I've been a bit more formal about
how that set is defined.

We'll have some trouble implementing this for the always-granted APIs like
"midi", so you may want to let the UA explicitly reject the update.

I don't really understand this condition, but it seems as though it can be
enforced at the onset of this algorithm. I'd prefer to do it there if possible.
What do you think?

Also, could you fix the IPR issue described in
https://labs.w3.org/hatchery/repo-manager/pr/id/w3c/permissions/151? Thanks.

All set!

(By the way, your suggestion to use WebIDL for type checking was awesome! Thank
you for pointing that out.)

@AutomatedTester
Copy link

I don't think that reactive is possible and to be honest I don't think its really a model we want. I said as much at TPAC. My preference would be for a proactive too.

We were happy, again discussed at TPAC, for us to go with the strawman that @burg has done in SafariDriver.

@jugglinmike
Copy link
Contributor Author

Thanks for clearing that up, @AutomatedTester. Fortunately, both @burg's POC and the text being reviewed here use the proactive paradigm. I believe that we just need to iron out a few remaining formalisms. I'll be confirming that with Jeffrey tomorrow.

index.bs Outdated
text: extension command; url: dfn-extension-commands
text: extension command name; url: dfn-extension-command-name
text: extension command prefix; url: dfn-extension-command-prefix
text: getting the property; url: dfn-getting-a-property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line isn't used anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Removed.

@jugglinmike
Copy link
Contributor Author

Hey @jyasskin: I've incorporated the changes we discussed on IRC today. Specifically:

  • Extend the step "Interpret parameters.state as if it were the result of an invocation of permission state" to occur inside a task (and update the algorithm to wait for all such tasks to complete)
  • Add a provision to allow the UA to initially reject the request if they can't honor it (for whatever reason)

I'm seeing now that this branch has a few conflict with master. They are limited to the Bikeshed "anchors" and "link-defaults" configuration and should be trivial to resolve. I'll hold off on rebasing until a reviewer requests that.

Anyway, thank you for your review! Could you recommend the next steps I should take to help move this patch along?

@jugglinmike
Copy link
Contributor Author

@foolip spoke with @mounirlamouri, and they agreed that the next step here is to rebase and resolve the conflicts, so I've done just that. Those conflicts were limited to the "anchors" and "link defaults" Bikeshed configuration and thus trivial to resolve.

The original state of this branch is available here.

@mounirlamouri
Copy link
Member

My understanding is that there were no editorial concerns from @marcoscaceres and @AutomatedTester comments were addressed by @jugglinmike or rather, the comments seemed to be non-issues.

It seems that @jyasskin reviewed the change and approves it. I only had a superficial look. Given the size of the change and in order to get moving, I'm merging this. Hopefully, concerns, if any, can be resolved in follow-ups.

Thanks for your contribution @jugglinmike :)

@jugglinmike
Copy link
Contributor Author

My pleasure!

jugglinmike added a commit to jugglinmike/permissions that referenced this pull request Jan 15, 2018
The WebDriver specification was recently updated to better-support
extension by web standards [1]. This change allows extension commands to
be defined without a dedicated "name". Because the Automation section
only defines a single command, the name "set" is superfluous. Remove it
in order to expose a more idiomatic HTTP API.

[1] w3c#151
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.