-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't queue ICE candidates in tests (remove coupleIceCandidates) #19866
Comments
So what is the time when addIceCandidate should be called? I don't think I understand the issue well enough. I'm working on implementing an operations chain at the moment. I had SLD/SRD/createOffer/createAnswer in mind for this, but I should make use of the chain in addIceCandidate as well. |
"That is: addIceCandidate gets queued behind any pending setRemoteDescription." That would only be true if addIceCandidate is invoked before setRemoteDescription. Are we talking about the offer or answer? Different PCs have different operations chains, right? |
If I understand correctly, one must not invoke addIceCandidate when we are in 'stable' state, is that correct?
I might imagine the answer to 1) is "no" due to promise magic, and pc2.SRD happening immediately on pc1.SLD resolving, before onicecandidate has had a chance to fire? But what about 2)? Anyway, I see this comment:
There may possibly be a Chrome bug here, I'm not sure. The WebRTC implementation of SLD is "completely finish executing SLD on the webrtc signaling thread (including spinning up things on the network thread) and then, post a message to the signaling thread to invoke the callback to inform Chrome that SLD completed - which happens in a post to the javascript thread". I don't know how the order of messages are processed, but assuming the ICE agent acts on the webrtc network thread and not the webrtc signaling thread, it might be possible for it to generate ICE candidates and to post back to the webrtc signaling thread that ICE candidates are available, before SLD has had a chance to post to itself saying it is complete. So perhaps the following order is possible: "SLD, onicecandidate, SLD complete callback" The fact that SLD's callback is delayed is something I wanted to fix before, but I ended up breaking some dependent project, so I left it that way for backwards-compatiblity reasons. If this is actually causing ICE candidates to be signalled before SLD complete callback, we should definitely fix Chrome. |
@jan-ivar, @alvestrand |
If Chrome is missing the operations chain here, that would explain it!
Yes, it's required by the spec.
No, thanks to the chain, the state at time of invocation isn't important, order is. What matters is: one mustn't invoke addIceCandidate(candidate) before SRD(description) of the description the candidate belongs to. E.g this is fine: pc.setRemoteDescription(offer);
pc.addIceCandidate(candidate); // Invoked in "stable"; processed in "have-remote-offer" This is not fine: pc.addIceCandidate(candidate); // rejects with InvalidStateError
pc.setRemoteDescription(offer); Putting the above into this fiddle shows the problem. First, here's Firefox's output, which is to spec:
Compare to Chrome:
Here addIce succeeds before SRD, violating the spec. It also rejects with the wrong error. |
Speaking of operations queue, something else we should test, but currently only works in Firefox. |
"What matters is: one mustn't invoke addIceCandidate(candidate) before SRD(description) of the description the candidate belongs to." How do we guarantee that? Queueing candidates until SRD is complete accomplishes it, but is overkill and not required by spec; queueing candidates until SRD starts would be according to spec; ensuring that candidates aren't created until SRD starts would also accomplish it, but I don't see how any piece of the spec guarantees that; the two PCs aren't linked in any way. |
the following code is legal, but will (I think) always fail: pc1.oncandidate = (c) => pc2.setIceCandidate(c.candidate) there are multiple ways for someone to get the effect of the timeout without being very clear in his mind that this is going to happen. Using coupleIceCandidates() helps them not shoot themselves in the foot. Simpler-to-use helpers are better, when we're not testing the specific feature targeted by the test. |
My take:
Remember addTrack? Chrome implemented it supporting the no stream argument case, but that ended up not running well on other browsers. So we updated most of the tests to take a stream as an argument, so that all relevant behaviors would be exercised on all browsers by not failing due to missing stream argument, even though the tests were testing things that didn't need a stream. We put the "no stream" as a separate test. Proposal:
|
It doesn't have one, but it almost has one. It has a weird threading model that most of the time does the trick anyway. Basically, if you call SRD followed by addIceCandidate, this will ALWAYS execute SRD before addIceCandidate. Why? Both SRD and addIceCandidate are executed, one after the other, on the same thread. Sometimes we block JS, and sometimes we don't, but we always get the same order. Only when we have something cached in the JS layer do we truly avoid jumping threads. I think the problem here being that JavaScript becomes aware of SLD completing with a small delay, so when you do this:
You're kind of implicitly doing this really:
That is, if the unconfirmed bug is what I think it is. |
This is already guaranteed with naive signaling (no later than the SLD success callback): pc.onicecandidate = e => signaling.send(e.candidate);
pc.setLocalDescription(offerOrAnswer).then(() => signaling.send(pc.localDescription)); This guarantees the following order on the signaling channel (e.g. with 3 candidates):
...in both directions:
This in turn guarantees correct processing order on the receiving end, using obvious signaling: signaling.onmessage = async ({data: {description, candidate}}) => {
if (description) {
await pc.setRemoteDescription(description);
...
} else if (candidate) {
await pc.addIceCandidate(candidate);
}
} I.e. guaranteeing the following invocation order:
I.e. addIceCandidate(candidate) never happens before the related SRD(description). |
The above was in response to @alvestrand's "How do we guarantee that?" |
I'm happy to have identified the Chrome bug. Let's get that fixed, and then remove the helper that wallpapered over it for a year. As to the benefits of helpers, that's subjective. I'm not against all helpers. I'm sympathetic to not having every test fail on behavior Y—the above is my compromise here—but my experience is that asynchronous helpers are problematic when done wrong, basically when they obfuscate asynchronous spec APIs. See #19890. |
For those, why not use the spec's abstraction for negotiation: perfect negotiation? Option 3 in #19890 (comment). |
I've documented the problem here: |
coupleIceCandidates is a wpt webrtc helper with this comment:
I see no spec support for needing to queue ice candidates to avoid intermittents. AFAIK this is only a problem in Chrome. If I'm wrong, and there's a spec flaw, I'm happy to be shown it, but no-one has challenged my assessment in #12028 (comment) in almost a year.
We should remove this function and use the existing exchangeIceCandidates throughout. WPT tests should not work around Chrome bugs, because doing so hampers progress, which seems evident here: Chrome has had a year to look into and fix this bug, but haven't. If people in the field need to do similar tricks to make Chrome work reliably, then this can clearly hurt perceptions of the API and thus adoption.
cc @alvestrand @henbos
The text was updated successfully, but these errors were encountered: