Clarify that a pull source doesn't need to wait in pull() #1014

ricea · 2019-09-13T07:21:58Z

When a source blocks inside pull(), it is not possible to cancel the ReadableStream until the promise returned by pull() settles. In some cases pull() may never return.

However, it is not necessary to block in pull(). In many (most?) cases, returning an already-resolved promise from pull() and then enqueueing later to trigger the next call to pull() works fine.

This should be clarified in the description of underlying source.

The text was updated successfully, but these errors were encountered:

MattiasBuelens · 2019-09-13T11:09:07Z

I don't think we should discourage returning a promise. Imagine a ReadableStream that enqueues multiple chunks after each pull():

const rs = new ReadableStream({
  async pull(c) {
    await new Promise(r => setTimeout(r, 100));
    c.enqueue('a');
    c.enqueue('b');
  }
}, { highWaterMark: 2 });

After c.enqueue('a') is called, the stream does not immediately call pull() again since the promise from the previous pull is still pending. After the second enqueue, the queue is filled up to the HWM, so the stream no longer needs to call pull() again. As a result, this stream will call pull() only once at start-up, as expected.

Compare this to a version that does not return a promise:

const rs = new ReadableStream({
  pull(c) {
    (async () => {
      await new Promise(r => setTimeout(r, 100));
      c.enqueue('a');
      c.enqueue('b');
    })();
  }
}, { highWaterMark: 2 });

After c.enqueue('a') is called, the stream immediately calls pull() again, since it only has 1 chunk in its queue and it wants to fill up to HWM = 2. However, it does not know that we call c.enqueue('b') immediately afterwards, so we end up filling up past the HWM with 4 chunks in the queue.

This is probably a bit of a contrived example. But you could have a similar problem if you need to do some cleanup after each pull(), for example:

// imagine there's a hypothetical `source` somewhere
const rs = new ReadableStream({
  pull(c) {
    await source.startPulling();
    const chunk = await source.getNextChunk();
    c.enqueue(chunk);
    await source.finishPulling();
  }
});

By returning a promise, we ensure that finishPulling() is always called before the next pull calls startPulling() again. If we change this to not return a promise (by wrapping the code inside pull() with an async IIFE), then this is no longer guaranteed and the hypothetical source might break.

Alternatively: could we pass an AbortSignal to pull(), similar to #1015? That way, the implementer can return a (potentially long-running) promise from pull(), and still have a way to handle cancellation in a more timely manner.

ricea · 2019-09-13T11:44:15Z

I didn't mean to discourage it, just to make clear that not returning a promise is an option. Maybe if I think about it a bit more I can come up with some concrete guidelines for when to do what.

I don't know how I feel about abortable pull() yet. I will have to give it more thought.

MattiasBuelens · 2019-09-13T13:50:21Z

Yeah, I'm not sure about it either, just thinking out loud. The examples above are purely hypothetical, I haven't actually come across a real-world use case where pull() does something else after controller.enqueue().

Then again, passing an AbortSignal would make for a nice parallel between readable and writable streams:

A readable stream waits for a pending pull() before calling cancel(), but you can skip the wait by inspecting the signal passed to pull().
A writable stream waits for a pending write() before calling abort(), but you can skip the wait by inspecting the signal passed to write().

MattiasBuelens · 2021-05-11T15:11:12Z

I don't think this part is correct:

When a source blocks inside pull(), it is not possible to cancel the ReadableStream until the promise returned by pull() settles.

As far as I know, we don't actually wait for the pending pull() to settle before we call cancel() on the source.

var rs = new ReadableStream({
  async pull(c) {
    console.log("source pull");
    await new Promise(r => setTimeout(r, 1000));
    console.log("source enqueue");
    c.enqueue("a");
    console.log("source pull done");
  },
  cancel(reason) {
    console.log("source cancel", reason);
  }
}, { highWaterMark: 0 });

var reader = rs.getReader();
reader.read().then((result) => console.log("read", result));
setTimeout(() => reader.cancel("hi"), 100);

This logs:

source pull
source cancel hi
read {value: undefined, done: true}
source enqueue

The source receives the cancel() call immediately, and the pending read() is fulfilled right away. When pull() eventually calls enqueue(), it'll throw an error (which the stream silently handles, since it has already become canceled):

TypeError: Failed to execute 'enqueue' on 'ReadableStreamDefaultController': Cannot enqueue a chunk into a closed readable stream

This also means my previous suggestion for passing an AbortSignal to pull() isn't really necessary. You can already do this yourself using cancel():

var rs = new ReadableStream({
  start() {
    this._controller = new AbortController();
  },
  async pull(c) {
    const response = await fetch("./more/data/please", {
      signal: this._controller.signal
    });
    const data = await response.json();
    c.enqueue(data);
  },
  cancel(reason) {
    this._controller.abort();
  }
}, { highWaterMark: 0 });

ricea · 2021-05-11T18:22:10Z

I'm not sure what the source of my confusion was. We can probably close this.

MattiasBuelens · 2021-05-11T21:09:01Z

It's not just you. I even wrote code specifically to work around this "problem". But then I found that everything worked just fine without it, so I reverted it again. 😅 Moral of the story: test all your assumptions! 😁

I'm in favor of closing this.

ricea · 2021-05-11T21:32:27Z

Maybe we should add an explicit note that cancel() can be called during pull()? It's different from the way WritableStream works.

ricea changed the title ~~Clarify that a pull() source doesn't need to wait in pull~~ Clarify that a pull source doesn't need to wait in pull() Sep 13, 2019

MattiasBuelens mentioned this issue Jan 31, 2020

Reset for TransformStream (and WritableStream?) #1026

Open

MattiasBuelens mentioned this issue Jan 25, 2021

Make read requests abortable #1103

Closed

ricea closed this as completed May 27, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify that a pull source doesn't need to wait in pull() #1014

Clarify that a pull source doesn't need to wait in pull() #1014

ricea commented Sep 13, 2019

MattiasBuelens commented Sep 13, 2019

ricea commented Sep 13, 2019

MattiasBuelens commented Sep 13, 2019

MattiasBuelens commented May 11, 2021 •

edited

Loading

ricea commented May 11, 2021

MattiasBuelens commented May 11, 2021

ricea commented May 11, 2021

Clarify that a pull source doesn't need to wait in pull() #1014

Clarify that a pull source doesn't need to wait in pull() #1014

Comments

ricea commented Sep 13, 2019

MattiasBuelens commented Sep 13, 2019

ricea commented Sep 13, 2019

MattiasBuelens commented Sep 13, 2019

MattiasBuelens commented May 11, 2021 • edited Loading

ricea commented May 11, 2021

MattiasBuelens commented May 11, 2021

ricea commented May 11, 2021

MattiasBuelens commented May 11, 2021 •

edited

Loading