What should the shape of the API be? #35

Closed
pthatcherg opened this issue Dec 18, 2019 · 3 comments
Labels
2020 Q1 Solve by end of Q1 2020

Comments

@pthatcherg
Contributor

A number of considerations all combine in somewhat complex ways, such as (re)initialization, buffering, failure recovery, and flushing. Obviously we'd like a good API for all of these, but there are tradeoffs between different options. This issue is for tracking the discussion of what we want the API to look like.

Here are some options. Note that it may be possible to have different options for encode and decode. For example, we could do a combination of B for encode and D for decode.

Option A: new Encoder/Decoder for each change

Every time you want to change something that requires (re)initialization, such as changing the codec or resolution, create a new Encoder/Decoder. Also reinit every time a flush is desired.

Pros:

  • Simple API
  • It's clear when (re)init fails, and recovering is straightforward
  • If we have buffered frames, it's clear which initialization applies to which frames.
  • Flushing is just closing the .writable

Cons:

  • Dealing with downstream changes to the pipeline may be difficult (because you have a new .readable that you need to pipe somewhere).
  • Dealing with upstream changes to the pipeline may be difficult (because you have a new .writable that you need to pipe into).
  • It may be much more efficient to have only one encoder/decoder around at a time, which is difficult to manage when the JS is creating new ones for every resolution change and/or flush.
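
The re-plumbing cost of Option A can be sketched as follows. This is a minimal illustration, not the proposed API: OneShotEncoder and encodeSegments are hypothetical names, the config fields are invented, and plain arrays stand in for .writable and .readable so the sketch runs synchronously.

```javascript
// Hypothetical stand-in for an Option A encoder: one instance per configuration.
// Plain arrays replace .writable/.readable so the sketch runs synchronously.
class OneShotEncoder {
  constructor(config) {
    // (Re)init failure is visible right here, at construction time.
    if (!config.codec) throw new Error("unsupported configuration");
    this.config = config;
    this.output = [];
    this.closed = false;
  }

  write(frame) {
    if (this.closed) throw new Error("encoder already flushed");
    this.output.push({ codec: this.config.codec, width: this.config.width, frame });
  }

  // Flushing is just closing: the instance is finished afterwards.
  close() {
    this.closed = true;
    return this.output;
  }
}

// The caller has to re-plumb on every change: a new encoder to feed,
// and a new output to collect from.
function encodeSegments(segments) {
  const encoded = [];
  for (const { config, frames } of segments) {
    const encoder = new OneShotEncoder(config);
    for (const frame of frames) encoder.write(frame);
    encoded.push(...encoder.close());
  }
  return encoded;
}
```

Each segment gets its own encoder, so it is always clear which configuration applied to which frame, but the loop body is exactly the plumbing work the cons above describe.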

Option B: An Initialize() method for each change

If a change requires a reinitialization, call Initialize(), as many times as you want. The .writable and .readable are stable.

Pros:

  • It's clear when (re)init fails, and recovering is straightforward
  • .readable is the same across multiple initializations, which makes the downstream consumption easier (nothing to re-pipe).
  • .writable is the same across multiple initializations, which makes the upstream production easier (nothing to re-pipe).
  • Reinitialization can be efficient because the implementation can keep only one encoder/decoder around at a time.

Cons:

  • Flushing must be a separate method (since you can only call .close() on the .writable once)
  • If buffering is used, which initialization applies to which frames becomes unclear. If we don't buffer on the .writable, we can avoid this, but that means that we must require the JS to respect .ready and we must deal with what happens when it does not.
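
The buffering ambiguity in the last point can be made concrete with a small sketch. BufferedEncoder is a hypothetical model, not the proposed API: it assumes that written frames sit in a queue until drained and that initialize() takes effect immediately.

```javascript
// Hypothetical Option B encoder: a stable input queue plus an out-of-band initialize().
class BufferedEncoder {
  constructor(config) {
    this.config = config;
    this.queue = [];  // frames accepted but not yet encoded
    this.output = [];
  }

  // Out-of-band change: applies immediately, regardless of what is queued.
  initialize(config) {
    this.config = config;
  }

  // Frames are buffered here, not encoded yet.
  write(frame) {
    this.queue.push(frame);
  }

  // Every queued frame is encoded with whatever config is current at drain time.
  drain() {
    while (this.queue.length > 0) {
      this.output.push({ frame: this.queue.shift(), width: this.config.width });
    }
  }
}

const encoder = new BufferedEncoder({ width: 640 });
encoder.write("frame-1");            // the app intended width 640 for this frame
encoder.initialize({ width: 1280 }); // overtakes the frame that is still buffered
encoder.write("frame-2");
encoder.drain();
```

In this model "frame-1" ends up encoded at width 1280, because the out-of-band call is not ordered relative to the buffered writes. Avoiding that requires either no buffering or a way to tie each frame to a configuration.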

Option C: An Initialize() method that produces a new WritableStream

If a change requires a reinitialization, call Initialize(), as many times as you want. The .readable is stable, but not the .writable (if there is one).

Pros:

  • It's clear when (re)init fails, and recovering is straightforward
  • .readable is the same across multiple initializations, which makes the downstream consumption easier (nothing to re-pipe).
  • Reinitialization can be efficient because the implementation can keep only one encoder/decoder around at a time.
  • Flushing is just closing the WritableStream
  • If we have buffered frames, it's clear which initialization applies to which frames.

Cons:

  • Dealing with upstream changes to the pipeline may be difficult (because you have a new Writable that you need to pipe into).
  • It's not exactly a TransformStream any more
  • Transferring streams may be tricky
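
A rough model of Option C, assuming a synchronous stand-in for the streams: StableOutputDecoder and its input objects are hypothetical, with an array playing the role of the stable .readable and each initialize() call handing back a fresh input in place of a new WritableStream.

```javascript
// Hypothetical Option C decoder: stable output, new input per initialization.
class StableOutputDecoder {
  constructor() {
    this.output = [];       // stands in for the stable .readable
    this.activeInput = null;
  }

  // Returns a fresh input bound to `config`; the previous input is retired.
  initialize(config) {
    if (!config.codec) throw new Error("unsupported configuration");
    if (this.activeInput) this.activeInput.closed = true;
    const output = this.output;
    const input = {
      closed: false,
      write(chunk) {
        if (this.closed) throw new Error("input was replaced or flushed");
        // The config is captured per input, so there is no ambiguity about
        // which initialization applies to which chunk.
        output.push({ codec: config.codec, chunk });
      },
      close() {
        this.closed = true; // flushing closes this input only
      },
    };
    this.activeInput = input;
    return input;
  }
}
```

The consumer of `output` never has to re-pipe, while the producer must switch to the newly returned input after every initialize(), which is the upstream cost listed above.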

Option D: In-band parameters

To reinitialize, put new parameters on the chunk passed into the .writable. Init failure is conveyed via a write failure.

Pros:

  • For decode, the source of frames (such as a media container) is likely related to the decoding parameters desired, making this a convenient/natural fit.
  • Cleaner encoder/decoder API (no extra Initialize/Flush methods)
  • Transferring streams is likely less tricky
  • If we have buffered frames, it's clear which initialization applies to which frames.
  • Upstream and downstream piping is easy (since both .readable and .writable are stable)
  • Flushing is just closing the .writable

Cons:

  • (Re)init failure recovery isn't straightforward. You'll likely need to catch an exception on .pipeTo and probably want to use preventCancel.
  • For encode, the source of frames (such as a MediaStreamTrack tied to a VideoTrackReader) likely isn't related to the encoding parameters desired, making this an inconvenient fit.
  • More complex chunk types (not just EncodedVideoFrame and VideoFrame; more things need to be on there)
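
One way to picture in-band parameters is a stream that carries two kinds of chunk. The sketch below is an assumption-laden illustration: InBandDecoder and the chunk shapes { type: "config", codec } and { type: "frame", data } are hypothetical, and a synchronous write() stands in for writing to the .writable.

```javascript
// Hypothetical Option D decoder: configuration travels in the same channel as frames.
class InBandDecoder {
  constructor(supportedCodecs) {
    this.supported = new Set(supportedCodecs);
    this.config = null;
    this.output = [];
  }

  write(chunk) {
    if (chunk.type === "config") {
      // (Re)init failure is conveyed as a failure of this write.
      if (!this.supported.has(chunk.codec)) {
        throw new Error(`cannot initialize codec ${chunk.codec}`);
      }
      this.config = chunk;
      return;
    }
    if (this.config === null) {
      throw new Error("frame received before any configuration");
    }
    // Ordering in the stream decides which configuration applies to a frame.
    this.output.push({ codec: this.config.codec, data: chunk.data });
  }
}
```

Because configuration and frames share one ordered channel, buffering cannot reorder them relative to each other. The price is that every consumer of the stream has to distinguish chunk types.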

Option E: Internal reinitialization

Instead of asking for an init, just give it what you want and have it (re)init when it needs to. There is a fine line between this and Option D, but consider resolution changes: instead of specifying that the codec reinit with a new size, you just give it whatever frame comes from a MediaStreamTrack and it reinits based on that frame's size. Similarly, an EncodedVideoFrame could simply express what codec it is, and the decoder deals with whatever it receives.

Pros:

  • API is easier to use, and simple
  • Transferring streams is likely less tricky
  • No problems with buffering and which settings apply to which frames
  • Upstream and downstream piping is easy (since both .readable and .writable are stable)
  • Flushing is just closing the .writable

Cons:

  • (Re)init failure recovery isn't straightforward. You'll likely need to catch an exception on .pipeTo and probably want to use preventCancel.
  • It's easy for performance issues to creep in and for the API to become too automatic/implicit. For example, if a codec switch happens and the new codec is only available via software, not hardware, should we reinit internally from hardware to software? This leads to a higher-level, more automatic API with constraints that are difficult to specify and keep consistent across browsers, somewhat like getUserMedia, whereas this API was initially intended to be low-level and explicit about performance.
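
The implicit behavior of Option E, and the hidden cost it can carry, might look like the sketch below. AutoReinitEncoder is hypothetical, frames are modeled as plain { width, height, data } objects, and reinitCount exists only to make the otherwise invisible reinitializations observable.

```javascript
// Hypothetical Option E encoder: reinitializes itself whenever the frame size changes.
class AutoReinitEncoder {
  constructor() {
    this.size = null;
    this.reinitCount = 0; // each reinit may be expensive, but the caller never asked for it
    this.output = [];
  }

  write(frame) {
    const changed =
      this.size === null ||
      this.size.width !== frame.width ||
      this.size.height !== frame.height;
    if (changed) {
      this.size = { width: frame.width, height: frame.height };
      this.reinitCount += 1;
    }
    this.output.push({ width: this.size.width, height: this.size.height, data: frame.data });
  }
}
```

The caller's code is as simple as it can be, yet a source that alternates between two resolutions would trigger a reinitialization on every frame without any signal in the API.
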
@sandersdan
Contributor

sandersdan commented Feb 8, 2020

Based on our experience implementing VideoDecoder in Chromium, @chcunningham and I reviewed these options and we are implementing option D.

  • A (replacing streams every time): Replumbing is less elegant than in-band signaling in the cases we considered. Consider a potential app implementation of seeking:
    // A promise resolves with a single value, so config and readable arrive as one object.
    demuxer.readFrom(time).then(({config, readable}) => {
      decoder.configure(config).then(writable => readable.pipeTo(writable));
    });
  • B (out-of-band with same stream): Buffering is inherent in WritableStream, so we cannot reliably synchronize in-band and out-of-band signals without replacing streams.
  • C (out-of-band with new stream): Basically the same as A.
  • D (in-band): Using preventCancel and preventAbort, the whole pipeline does not need to be torn down on failure. Main downside is that streams contain multiple types of messages. Here the app implementation is much simpler because the configuration does not need to be separately plumbed:
    demuxer.seek(time)
  • E (in-band with implicit configuration): Same as D with more complexity.

It still remains to be seen if preventAbort/preventCancel/signal solutions are intuitive enough to be required for first-time use of WebCodecs. It would be a bad outcome if apps always wrap decoders to provide a different interface.
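
To give a feel for what the preventCancel approach asks of an app, here is a minimal sketch using the standard Streams API. It assumes a runtime that exposes ReadableStream and WritableStream as globals; sourceOf, decoderSink, and decodeWithFallback are hypothetical helpers rather than WebCodecs API, and the { codec, data } chunk shape is invented for illustration.

```javascript
// Builds a finite source from an array of chunks.
function sourceOf(chunks) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
}

// Hypothetical decoder sink that fails the write for any chunk
// whose codec it cannot handle, as an in-band init failure would.
function decoderSink(supportedCodec, decoded) {
  return new WritableStream({
    write(chunk) {
      if (chunk.codec !== supportedCodec) {
        throw new Error(`cannot initialize ${chunk.codec}`);
      }
      decoded.push(chunk.data);
    },
  });
}

async function decodeWithFallback(source, decoded) {
  try {
    // preventCancel keeps the source alive if the destination errors.
    await source.pipeTo(decoderSink("vp8", decoded), { preventCancel: true });
  } catch (err) {
    // The source was not cancelled, so the remaining chunks can be piped
    // into a replacement sink. The chunk that triggered the failure has
    // already been read from the source and is not replayed.
    await source.pipeTo(decoderSink("h264", decoded));
  }
  return decoded;
}
```

Recovery works, but the app has to know about preventCancel, catch the rejection from pipeTo, and accept or separately handle the chunk that was consumed by the failed write. preventAbort plays the mirror-image role when the source, rather than the destination, is the side that errors.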

@sandersdan
Contributor

After thinking on this over the weekend, I think option A/C is worth considering further. If we provide a configure() that replaces the streams (and aborts the old ones), then clients can use preventAbort/preventCancel to polyfill any of the other options. This may be the safest path while we wait to see what happens with flush in streams.

@chcunningham
Collaborator

Obsolete. See explainer updates (decouple from streams).
