Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BroadcastChannel spec is vague about asynchronous nature of global list management #7267

Open
wanderview opened this issue Oct 27, 2021 · 10 comments

Comments

@wanderview
Copy link
Member

wanderview commented Oct 27, 2021

Currently the BroadcastChannel spec is written such that the constructor is just a couple of synchronous steps:

https://html.spec.whatwg.org/#dom-broadcastchannel

And then in the postMessage algorithm it claims to synchronously get a list of all BroadcastChannels through the browser. See step 4 here:

https://html.spec.whatwg.org/#dom-broadcastchannel-postmessage

This is rather impractical since multi-process browser architectures pretty much require asynchronous operations to get this list (or performance kill synchronous IPC).

What's more, browsers actually do the async step in the constructor:

I think its fair to say browsers do this for performance. To collect the list for every postMessage() would involve messaging every process in the browser for each message. So instead browsers register a list of BC objects in a central place using IPC instead.

There are some subtle differences between the spec and the "register asynchronously upfront" approach.

Consider:

  1. Worker thread A creates BC1 with channel name "foo".
  2. Worker thread A calls self.postMessage() to tell its parent frame that its ready.
  3. Parent frame B creates BC2 with channel name "foo".
  4. Parent frame B calls BC2.postMessage().

Is BC1 guaranteed to receive the message?

According to the spec, yes. Because BC1 was constructed before BC2.postMessage() was invoked.

In chrome, however, separate threads message the browser process to register in the central BC list using separate IPC message queues. That means BC2 can actually construct and post its message before BC1 has registered in the list. This means there is a race condition that determines if the message is received.

We have not observed this race in firefox, though. I believe this is likely because its PBackground IPC mechanism uses a single pipe for all messages from one process to another. I believe the race would exist if the worker thread A were a SharedWorker or ServiceWorker in a separate process, though.

What is the intended behavior?

The spec says one thing, but its wildly different from implementations and its unclear if its implications were intended. Similarly, its unclear firefox's implementation is based on this nuance vs as a side effect of its IPC implementation, and it likely doesn't provide the guarantee in multi-process scenarios.

Chrome's implementation is based on the assertion that BC ordering is only guaranteed within a single thread; e.g. a synchronously scriptable set of frames, a single worker thread, etc.

Are we attempting to force guarantees across threads? And if so, how do we do that if thread A and thread B are in different processes? We can't have the same IPC pipe in the case and the spec's style of implementation would be terrible for performance.

I would like to suggest we not require cross-thread ordering of setup guarantees.

In the meantime we are likely to land some WPT tests to deflake some of the cases that hit this condition. I hope no one objects to that.

Thoughts? @mkruisselbrink @domenic @annevk @asutherland @cdumez

@wanderview
Copy link
Member Author

cc @recvfrom, who did all the hard work to track down this race condition.

@asutherland
Copy link

I would like to suggest we not require cross-thread ordering of setup guarantees.

Yes, this seems like the only sane course of action. (And any accidental invariants Firefox's current IPC implementation are providing are potentially going to change in the medium term future.)

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 27, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case. For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 27, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 28, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 28, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}
FrankEnderman pushed a commit to FrankEnderman/cursemium that referenced this issue Oct 28, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}
@annevk
Copy link
Member

annevk commented Oct 28, 2021

See also #5327.

cc @gterzian

@wanderview
Copy link
Member Author

wanderview commented Oct 28, 2021

I would like to suggest we not require cross-thread ordering of setup guarantees.

Yes, this seems like the only sane course of action. (And any accidental invariants Firefox's current IPC implementation are providing are potentially going to change in the medium term future.)

Another option we could consider (although I don't like it), would be to use sync IPC for the setup of BroadcastChannel.

This would be a bit nicer for web developers because they wouldn't have to reason about this kind of setup race, etc. Of course the downside is that sync IPC is bad for performance, but at least it would only be once at construction time.

@annevk

This comment has been minimized.

@wanderview

This comment has been minimized.

@annevk

This comment has been minimized.

@wanderview

This comment has been minimized.

@asutherland
Copy link

I think the good news is that if the BC tests are having coordination problems, https://github.com/web-platform-tests/rfcs/blob/remote_channel/rfcs/remote_channel.md (prototype PR at web-platform-tests/wpt#2980) can probably help.

I'm strongly opposed to adding a (de facto) requirement to use Sync IPC without an extremely good reason, which I don't think we have. And even then I think we'd want to weigh whatever that reason is against consultation with security experts on whether the exposure of a sync IPC on Workers that provides a storage-key-wide total ordering would be a problem since it might somehow be used as a cross-agent-cluster timing gadget. While I think in many browsers Blob URL creation may already provide a means of inducing a sync IPC, the resulting Blob URL isn't something that be communicated over in a way that can help agents A and B establish an ordering.

(Note that I'm not saying there's definitely a viable security attack that could result. I'm just saying that the potential makes my head hurt and feels like it could result in a lot of interesting security papers that would in turn make my head hurt in the future.)

@wanderview
Copy link
Member Author

Ok. Happy to leave ordering undefined across threads.

moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Oct 29, 2021
…iframes, closed workers, a=testonly

Automatic update from web-platform-tests
BroadcastChannel: Add WPTs for detached iframes, closed workers

This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}

--

wpt-commits: de06181503dc24462beb8b1e5cf750ce382c27cf
wpt-pr: 31290
jamienicol pushed a commit to jamienicol/gecko that referenced this issue Nov 1, 2021
…iframes, closed workers, a=testonly

Automatic update from web-platform-tests
BroadcastChannel: Add WPTs for detached iframes, closed workers

This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}

--

wpt-commits: de06181503dc24462beb8b1e5cf750ce382c27cf
wpt-pr: 31290
Gabisampaio pushed a commit to Gabisampaio/wpt that referenced this issue Nov 18, 2021
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}
mjfroman pushed a commit to mjfroman/moz-libwebrtc-third-party that referenced this issue Oct 14, 2022
This CL adds tentative WPTs that test whether BroadcastChannels in
detached iframes and closed workers behave according to proposed
changes to the specification. For more details, see:

whatwg/html#7219

Note that the WPT CI found the existing closing/re-opening test
(and two of the new tests) to be flaky in Chrome. From
investigating further it seems there's a race condition when
instantiating and sending BroadcastChannel messages between
a worker and its parent such that using postMessage from either
side to indicate BroadcastChannel readiness is insufficient.
This CL also adds extra logic to mitigate this in the existing
test case (and the new ones). For more info, see:

whatwg/html#7267

Bug: 1239274
Change-Id: Ia014e412a2dc8267aac57041672835fb90994d89
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/3229181
Commit-Queue: Andrew Williams <[email protected]>
Reviewed-by: Ben Kelly <[email protected]>
Cr-Commit-Position: refs/heads/main@{#935672}
NOKEYCHECK=True
GitOrigin-RevId: 0be41ddeb37f733a7594164d753431694d4f53b2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants