Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Communication channel from page scripts to the automation client #157

Closed
sadym-chromium opened this issue Dec 7, 2021 · 13 comments
Closed
Labels
enhancement New feature or request module-script Script module

Comments

@sadym-chromium
Copy link
Contributor

Extract the discussion from #144 (comment).

@foolip
Regarding postMessage and CDP's Runtime.addBinding, should we have something like Runtime.addBinding in BiDi that could also be used to create a message channel between the original script execution context and the BiDi client? postMessage is in a way a more specialized form of Runtime.addBinding, that only works in sandboxes.

Disclaimer: I'm a novice to some of this, high risk that I'm misunderstanding.

@sadym-chromium
I'm not sure yet, what is the SandboxGlobalScope for CDP Page.createIsolatedWorld is, but I'd assume it creates the new instance of Window wrapper. If so, it would be quite difficult to implement custom global for sandbox.

Regarding postMessage, we can add an optional parameter bindings to provide postMessage or any other bindings user likes.

@jgraham
Note that Web Extensions content scripts already use the same sandbox/isolated world concept and provide additional APIs on the global object, so I'm not convinced it's impossible to do that. But maybe the global is itself the SandboxWindowProxy or whatever we call it (i.e. the same object as window). I should check.

@jgraham

Regarding postMessage and CDP's Runtime.addBinding, should we have something like Runtime.addBinding in BiDi that could also be used to create a message channel between the original script execution context and the BiDi client? postMessage is in a way a more specialized form of Runtime.addBinding, that only works in sandboxes.

Well they're a bit different. addBinding creates a named function in one or more globals. Presumably the function is them visible to content scripts, so you have to be careful to avoid a name clash. It also only takes a string argument, so you don't get any protocol level serialization of argument values. In practice Puppeteer seems to use this in content scripts to implement an "evaluate in client" feature (i.e. to allow calling a client defined function which does something outside the browser sandbox and and passes the result back to the browser).

It's kind of unclear to me that setup where the code running in the page is directly calling a global binding is a good idea. For a similar use case I think I'd define something like

WebDriverMessagePort {
    postMessage(any message)
}

which wouldn't be by default exposed in any global, but would be seriazable over the protocol as something like {type: "webdriverport", value: {id: <someId>}}. Then you could do script.callFunction((port) => {/*do whatever with the port */}, args=[{type: "webdriverport", value: {id: 1}}])which would allow you to do whatever you want with the port, including create a function on the global that forwards its argument to the port, which would basically re-implement addBinding. We'd still need the ability to automatically execute scripts during the initial page load, but I think we want that anyway.

@jgraham
So, regarding the global, it does seem like the global should be the SandboxWindowProxy or whatever we're calling it. So perhaps it makes sense to put the WebDriver API in a webdriver object i.e. we'd have webdriver.postMessage since otherwise it conflicts with Window.postMessage. Or the other option would be to avoid having any special APIs in the global scope and only do the "pass in a WebDriverMessagePort" suggestion above. But that seems less convenient for users.

@sadym-chromium
I like the idea: instead of adding global bindings, pass bindings as deserializable arguments to script.callFunction. What I don't understand is why do we need the whole new interface for that? The argument can be something like:

{
  type: "binding", 
  value: "SOME_BINDING_NAME"
}

, which will be deserialized to a callback. Calling that callback with any arguments will cause the BiDi event with those arguments like:

{
  method: "script.bindingCalled",
  params: {
    arguments: [... remote values ...],
    name: "SOME_BINDING_NAME"
  }
}
@sadym-chromium sadym-chromium changed the title Communication from user script to client. Communication channel from page scripts to the automation client Dec 7, 2021
@jgraham
Copy link
Member

jgraham commented Dec 7, 2021

It feels like it should still be a platform object that implements an interface (even if that's a callback interface) rather than it just being a magic function. But yes, I agree the general plan there would work and avoids the need to figure out sandbox-specific IDL for now.

@sadym-chromium sadym-chromium added enhancement New feature or request module-script Script module labels Dec 7, 2021
@jgraham
Copy link
Member

jgraham commented Dec 7, 2021

Er not callback interface but https://webidl.spec.whatwg.org/#dfn-callback-function perhaps.

@jgraham
Copy link
Member

jgraham commented Dec 7, 2021

At risk of bikeshedding, arguably the term "binding" is misleading in this model because it's not bound to anything really. It makes more sense to me to have the event called script.callback and the identifying property called id rather than name since it's really an identifier. But other than that the shape of the event looks good to me.

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Communication channel from page scripts to the automation client..

The full IRC log of that discussion <AutomatedTester> topic: Communication channel from page scripts to the automation client.
<AutomatedTester> github: https://github.com//issues/157
<AutomatedTester> sadym: The first topic wanted to discuss was around what James brought up around the comms channel
<AutomatedTester> ... there is an example in the discussion linked
<jgraham> q+
<AutomatedTester> ... [discusses the example] and James had some ideas that I liked
<AutomatedTester> ack jgraham
<AutomatedTester> jgraham: To put this into context, we discussed to have a sandbox value and [discusses generation of custom events]
<AutomatedTester> ... this would an alternative to the other way as we can have a callback that is called
<AutomatedTester> ... this will be simpler from spec prose as we don't have to get webidl to do things it doesnt' normally do
<AutomatedTester> ... one concern that could be there is "what happens to scripts that are loaded on page load". It's not going to be impossible just going to be "fun"
<AutomatedTester> ... I will write on the issue for deeper discussion
<AutomatedTester> q?

@bwalderman
Copy link
Contributor

In the current proposal it looks like the binding (callback? port?) object on the local end allows one-way communication from the browser to the client only. If the client wants to send a message to a page script or sandbox script running in the browser, I suppose they can always use script.execute/script.callFunction but should we consider making the binding object bidirectional? Having a common object to send/receive messages in either direction does seem more elegant from a user perspective although it would complicate the API somewhat.

@bwalderman
Copy link
Contributor

Let's say the client creates a binding object using a local value as proposed below:

{
  type: "binding", 
  value: "SOME_BINDING_NAME"
}

For other local value types, the client that creates them doesn't assign any identity to them so it doesn't need to care about cleanup. "Binding" values (or whatever we call them) would be a little different from these other local value types though, because the client must have some way to ensure the corresponding platform object in the javascript runtime is kept alive. So I think we'll want to keep a strong ref on any binding objects that are created and have a command that clients can use to free that ref when they are done.

@jgraham
Copy link
Member

jgraham commented Dec 9, 2021

At least for one-way communication, I don't see the resource management problem. On the local (client) side you do something like (pseudocode):

// In the background this registers a listener for `script.callback` events, and handles any that
// match the randomly generated id
let target = new RemoteEventTarget();
// target is serialized as {type: "callback", value: {"id": <some string e.g. a uuid>}}
remoteSandbox.executeScript(callback => {
  /* This bit executes in the browser */
  // Forward all messages to the client
  document.addEventListener("message", event => callback(event.data), false);
}, [target]);
// Have some way to handle the actual message data
target.onmessge = handleRemoteMessage;

So the browser and the client both manage their own objects, and the lifetimes aren't tied.

@bwalderman
Copy link
Contributor

Ah, my mistake. I guess resource management is less of a problem for one-way communication. If there are no more references to the browser object then obviously the browser can't send any more messages to the local (client) end and the local end doesn't need to do anything.

I do think there should be a way for the local end to stop receiving messages though. CDP has Runtime.removeBinding to go with Runtime.addBinding. According to the docs, this doesn't remove the binding function from the page's global object, it just stops firing notifications on the client side. If we were to make a corresponding command in BiDi, I think the most reasonable behavior would be to break the connection and stop firing events on the local end. The browser side callback function would then become a no-op.

@jgraham
Copy link
Member

jgraham commented Dec 13, 2021

One issue that came up was how to handle the case of scripts that are configured to start automatically when a new context is created. We would have to have some way to provide the callback functions to those scripts, whereas with a postMessage API that just exists in the sandbox there's no requirement to pass in arguments.

@bwalderman
Copy link
Contributor

I agree that a postMessage API exposed on the sandbox global is necessary for bootstrap scripts so that they can start posting events to the client immediately. I'd like to try and reconcile that with the callback-passing approach being discussed in this issue since both approaches are valuable. The callback-passing proposal is useful so that CDP clients using Runtime.addBinding/removeBinding have a BiDi API to transition to.

I think it makes sense to align both of these approaches and fire the same kind of event when either postMessage or a deserialized callback object is called. At the risk of even further bikeshedding, I'd suggest naming this new event script.callbackCalled for consistency with other event names which are "noun-verb".

ScriptCallbackCalledEvent = {
    method: "script.callbackCalled",
    params: {
        id: text,
        realm: Realm,
        args: [RemoteValue]
    }
}

For events generated by callback objects passed from the client, the id will be the id that the client provided. For events generated by a sandbox script calling postMessage, we could require clients to specify a callback identifier when they first create a sandbox and then use this identifier as the id in future events.

I also realized that a command similar to Runtime.removeBinding won't really be necessary as I previously thought. If CDP-style addBinding can be implemented by taking a callback object and exposing it as a global property, then removeBinding can be implemented by replacing that function with a no-op.

@css-meeting-bot
Copy link
Member

The Browser Testing and Tools Working Group just discussed Communication channel from the browser to the client.

The full IRC log of that discussion <foolip> topic: Communication channel from the browser to the client
<foolip> github: https://github.com//issues/157
<foolip> sadym: there are a few approaches here
<foolip> sadym: we could make it bi-directional, or just one direction with a binding
<foolip> sadym: bindings would trigger an event on the BiDi client side.
<foolip> q?
<jgraham> q+
<foolip> foolip: this question was previously tied up with sandbox scripts and communicating with them. with the bindings approach, will we talk to sandbox scripts and content scripts in the same way?
<foolip> sadym: yes, and if the client wants to talk to multiple sandboxes they have to implement that.
<jgraham> ack jgraham
<foolip> foolip: I agree that this is a good starting point, with just a way to emit an event.
<foolip> foolip: there are questions about how exactly bindings should work still
<foolip> jgraham: an earlier suggestion was that sandboxes get a postMessage. the new suggestion is bindings which is a method you can invoke. that works in both sandbox and content scripts.
<foolip> jgraham: an issue we discussed last time is that once we have bootstrap scripts, we won't have the chance to pass in that binding. then maybe we need to pass it in when we configure the page load script. or an interface that exists only for page load scripts.
<foolip> jgraham: but Brandon's proposal in the GitHub issue was to make the protocol level bits look the same regardless of the API
<foolip> jgraham: at the protocol layer we seem to have broad agree, possible apart from bikeshedding the name,
<foolip> jgraham: there's a question about what it should look like in the API
<foolip> foolip: what is API and protocol in this discussion?
<foolip> jgraham: the API is what an injected script sees
<foolip> sadym: scripts initiated by a sandbox are seen by the client too? (scribe didn't get it all)
<foolip> jgraham: for bootstrap scripts, we could pass in a function to support arguments (which is how bindings are passed)
<foolip> foolip: ok, so you'd pass in a function to establish the binding, similar to how you now do it post-load for regular scripts?
<foolip> jgraham: yes
<foolip> q?
<foolip> foolip: there was previously discussion about what a binding looks like. Is a binding a function which if invoked does the magic to emit the event?
<foolip> jgraham: yes
<foolip> Agenda is here: https://www.w3.org/wiki/WebDriver/2022-01-BiDi

@jgraham
Copy link
Member

jgraham commented Sep 16, 2022

#65 (comment) contains some content relevant to this.

@jgraham
Copy link
Member

jgraham commented Apr 12, 2023

I just landed #361 so I think we can close this, and use different issues for any future work in this area.

@jgraham jgraham closed this as completed Apr 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request module-script Script module
Projects
None yet
Development

No branches or pull requests

4 participants