Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify that a pull source doesn't need to wait in pull() #1014

Closed
ricea opened this issue Sep 13, 2019 · 7 comments
Closed

Clarify that a pull source doesn't need to wait in pull() #1014

ricea opened this issue Sep 13, 2019 · 7 comments

Comments

@ricea
Copy link
Collaborator

ricea commented Sep 13, 2019

When a source blocks inside pull(), it is not possible to cancel the ReadableStream until the promise returned by pull() settles. In some cases pull() may never return.

However, it is not necessary to block in pull(). In many (most?) cases, returning an already-resolved promise from pull() and then enqueueing later to trigger the next call to pull() works fine.

This should be clarified in the description of underlying source.

@ricea ricea changed the title Clarify that a pull() source doesn't need to wait in pull Clarify that a pull source doesn't need to wait in pull() Sep 13, 2019
@MattiasBuelens
Copy link
Collaborator

I don't think we should discourage returning a promise. Imagine a ReadableStream that enqueues multiple chunks after each pull():

const rs = new ReadableStream({
  async pull(c) {
    await new Promise(r => setTimeout(r, 100));
    c.enqueue('a');
    c.enqueue('b');
  }
}, { highWaterMark: 2 });

After c.enqueue('a') is called, the stream does not immediately call pull() again since the promise from the previous pull is still pending. After the second enqueue, the queue is filled up to the HWM, so the stream no longer needs to call pull() again. As a result, this stream will call pull() only once at start-up, as expected.

Compare this to a version that does not return a promise:

const rs = new ReadableStream({
  pull(c) {
    (async () => {
      await new Promise(r => setTimeout(r, 100));
      c.enqueue('a');
      c.enqueue('b');
    })();
  }
}, { highWaterMark: 2 });

After c.enqueue('a') is called, the stream immediately calls pull() again, since it only has 1 chunk in its queue and it wants to fill up to HWM = 2. However, it does not know that we call c.enqueue('b') immediately afterwards, so we end up filling up past the HWM with 4 chunks in the queue.

This is probably a bit of a contrived example. But you could have a similar problem if you need to do some cleanup after each pull(), for example:

// imagine there's a hypothetical `source` somewhere
const rs = new ReadableStream({
  pull(c) {
    await source.startPulling();
    const chunk = await source.getNextChunk();
    c.enqueue(chunk);
    await source.finishPulling();
  }
});

By returning a promise, we ensure that finishPulling() is always called before the next pull calls startPulling() again. If we change this to not return a promise (by wrapping the code inside pull() with an async IIFE), then this is no longer guaranteed and the hypothetical source might break.


Alternatively: could we pass an AbortSignal to pull(), similar to #1015? That way, the implementer can return a (potentially long-running) promise from pull(), and still have a way to handle cancellation in a more timely manner.

@ricea
Copy link
Collaborator Author

ricea commented Sep 13, 2019

I didn't mean to discourage it, just to make clear that not returning a promise is an option. Maybe if I think about it a bit more I can come up with some concrete guidelines for when to do what.

I don't know how I feel about abortable pull() yet. I will have to give it more thought.

@MattiasBuelens
Copy link
Collaborator

Yeah, I'm not sure about it either, just thinking out loud. The examples above are purely hypothetical, I haven't actually come across a real-world use case where pull() does something else after controller.enqueue().

Then again, passing an AbortSignal would make for a nice parallel between readable and writable streams:

  • A readable stream waits for a pending pull() before calling cancel(), but you can skip the wait by inspecting the signal passed to pull().
  • A writable stream waits for a pending write() before calling abort(), but you can skip the wait by inspecting the signal passed to write().

@MattiasBuelens
Copy link
Collaborator

MattiasBuelens commented May 11, 2021

I don't think this part is correct:

When a source blocks inside pull(), it is not possible to cancel the ReadableStream until the promise returned by pull() settles.

As far as I know, we don't actually wait for the pending pull() to settle before we call cancel() on the source.

var rs = new ReadableStream({
  async pull(c) {
    console.log("source pull");
    await new Promise(r => setTimeout(r, 1000));
    console.log("source enqueue");
    c.enqueue("a");
    console.log("source pull done");
  },
  cancel(reason) {
    console.log("source cancel", reason);
  }
}, { highWaterMark: 0 });

var reader = rs.getReader();
reader.read().then((result) => console.log("read", result));
setTimeout(() => reader.cancel("hi"), 100);

This logs:

source pull
source cancel hi
read {value: undefined, done: true}
source enqueue

The source receives the cancel() call immediately, and the pending read() is fulfilled right away. When pull() eventually calls enqueue(), it'll throw an error (which the stream silently handles, since it has already become canceled):

TypeError: Failed to execute 'enqueue' on 'ReadableStreamDefaultController': Cannot enqueue a chunk into a closed readable stream

This also means my previous suggestion for passing an AbortSignal to pull() isn't really necessary. You can already do this yourself using cancel():

var rs = new ReadableStream({
  start() {
    this._controller = new AbortController();
  },
  async pull(c) {
    const response = await fetch("./more/data/please", {
      signal: this._controller.signal
    });
    const data = await response.json();
    c.enqueue(data);
  },
  cancel(reason) {
    this._controller.abort();
  }
}, { highWaterMark: 0 });

@ricea
Copy link
Collaborator Author

ricea commented May 11, 2021

I'm not sure what the source of my confusion was. We can probably close this.

@MattiasBuelens
Copy link
Collaborator

It's not just you. I even wrote code specifically to work around this "problem". But then I found that everything worked just fine without it, so I reverted it again. 😅 Moral of the story: test all your assumptions! 😁

I'm in favor of closing this.

@ricea
Copy link
Collaborator Author

ricea commented May 11, 2021

Maybe we should add an explicit note that cancel() can be called during pull()? It's different from the way WritableStream works.

@ricea ricea closed this as completed May 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants