Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stream: add stream.pipeline and stream.onEnd #19828

Closed
wants to merge 5 commits into from

Conversation

mafintosh
Copy link
Member

@mafintosh mafintosh commented Apr 5, 2018

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • documentation is changed or added
  • commit message follows commit guidelines

Adds stream.pipeline(...streams, [cb]) and stream.onEnd(stream, cb) based on my two modules https://github.com/mafintosh/pump and https://github.com/mafintosh/end-of-stream.

Needs tests+docs (writing those now), but thought it'd be good to get this up now for feedback.

Supersedes #13506, as there's been changes since so making a new PR was easier

@nodejs-github-bot nodejs-github-bot added the lib / src Issues and PRs related to general changes in the lib or src directory. label Apr 5, 2018
@vsemozhetbyt vsemozhetbyt added stream Issues and PRs related to the stream subsystem. semver-minor PRs that contain new features and should be released in the next minor version. labels Apr 5, 2018
@mafintosh mafintosh force-pushed the add-pump-and-eos branch 3 times, most recently from 6cc85d2 to ca4f736 Compare April 5, 2018 11:34
@mafintosh
Copy link
Member Author

Goal is to get this out in 10, backports might be tricky as the core streams in 9, doesn't play well with error handling patterns used by pump.

@mscdex
Copy link
Contributor

mscdex commented Apr 5, 2018

Isn't the second commit (changing the order of emitted stream events) unrelated and would be better as a separate PR?

lib/stream.js Outdated
@@ -33,6 +35,9 @@ Stream.Duplex = require('_stream_duplex');
Stream.Transform = require('_stream_transform');
Stream.PassThrough = require('_stream_passthrough');

Stream.pipeline = pump;
Stream.onEnd = eos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

onEnd is a confusing name in the context of streams, we should use something else

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open for suggestions :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we used finished, like in mississippi?

@mscdex
Copy link
Contributor

mscdex commented Apr 5, 2018

I think the two new internal modules could use some improvements code style-wise (they may not even pass the linter?) and performance-wise (e.g. replacing functional array methods).

@mscdex
Copy link
Contributor

mscdex commented Apr 5, 2018

Also, why does this need to live on require('stream') and not the more appropriate and more specific stream constructor(s)?

@mafintosh
Copy link
Member Author

@mscdex the order change is to make the streams play well with the pump code, and how it, in general should be emitted (mistake on my part when I updated the error handling). Happy to move that out if that's easier to review.

I'd very much prefer not to rewrite the source at this point. This is ported with minimal changes from my pump and end-of-stream module, to pass the node core linter, and there is lots of complexity in there.

I'm not I understand what you mean by constructors?

@mscdex
Copy link
Contributor

mscdex commented Apr 5, 2018

I'd very much prefer not to rewrite the source at this point. This is ported with minimal changes from my pump and end-of-stream module, to pass the node core linter, and there is lots of complexity in there.

IMO if it's being included directly in node core, we should strive to make the code as readable as possible and performant as possible (especially in this case because the public functions being added could be in hot paths). There isn't even that much code there, so I don't see why this is a problem. Some of the changes would even be pretty straightforward (e.g. replacing the functional array methods).

I'm not I understand what you mean by constructors?

e.g. require('stream').Readable

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some test regarding to http.OutgoingMessage (both on a client and on a server)? They are the special writable-like object.

Also for HTTP2 compat API.

lib/stream.js Outdated
@@ -22,6 +22,8 @@
'use strict';

const { Buffer } = require('buffer');
const pump = require('internal/streams/pump');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would name this pipeline instead.

});
}

run();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to call common.crashOnUnhandledRejection() at the top of the file.


await pipelinePromise(read, write);

// nexttick to avoid the promise catching it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this nextTick is not clear to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was to work around the need for common.crashOnUnhandledRejection() which i didn't know about 👍

read.push('data');
setImmediate(() => read.destroy(new Error('kaboom')));

pipeline(read, write, common.mustCall((err) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some test for the return value of pipeline?

if (Array.isArray(streams[0])) streams = streams[0];

if (streams.length < 2) {
throw new ERR_ASSERTION('pipeline requires two streams per minimum');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the right error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an assertion error no?

req.on('close', common.mustCall(() => {
req.on('error', common.mustCall(() => {
req.on('error', common.mustCall(() => {
req.on('close', common.mustCall(() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these changes part of this PR, if so why?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I messed up the emit order in my streams error handling PR (error should be before close). Otherwise these modules will report a premature close error instead of the error passed to stream.destroy. It's quite related to the end-of-stream/pump stuff so just bundled that in here. Want me to open an independent PR with that? (tests won't pass without this fix)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes please! If possible I would like to backport this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I don't think a backport is needed as the stream error handling stuff we did is going out in 10 only I think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I would like to backport this PR to 8 if possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm splitting it out in a sec

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mafintosh
Copy link
Member Author

@mscdex Still not sure I follow, you want this exposed as stream.Readable.pipeline?

@mafintosh mafintosh force-pushed the add-pump-and-eos branch 3 times, most recently from c128bad to bd162fa Compare April 5, 2018 13:51
@@ -0,0 +1,95 @@
// Ported from https://github.com/mafintosh/pump with
// permission from the author, Mathias Buus (@mafintosh).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the original code has a copyright/license, then it should likely either be included here or embedded in the LICENSE file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but I did mention here I'm fine waiving copyright, mafintosh/pump#17 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I'm fine with whatever is easiest btw)

if (Array.isArray(streams[0])) streams = streams[0];

if (streams.length < 2) {
throw new ERR_ASSERTION('pipeline requires two streams per minimum');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is for the assert module.

You should probably be using ERR_MISSING_ARGS here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

return streams.pop();
}

function pump(...streams) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you name this pipeline?

// where the client keeps the event loop going
// (replacing the rs.destroy() with req.end()
// exits it so seems to be a destroy bug there
client.unref();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be investigating this later on this afternoon :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing this with other Readable instances shows this working. For instance, replace the custom Readable above with const rs = fs.createReadStream(__filename) and the test appears to work just fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, just verified, commenting out the client.unref() and replacing the custom Readable with fs.createReadStream() and adjusting the cnt below a bit to account for the difference in the number of data events allows this test to pass and the Http2Stream shuts down normally. So the issue appears to be specific to the custom Readable?? /cc @mcollina

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasnell hmm, i don't understand how the Readable should affect this. it's just the trigger of the destruction of the pipeline. could it be that there is an http2 issue if a bunch of writes happen while the stream is destroyed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I'm aware of. At this point I'm a bit unsure what could be happening. I've tried a number of variations but can only reproduce it with the custom Readable. I've asked @mcollina to take a look. He said he will take a look tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mcollina... Please don't forget to take a look at this one :)

@mafintosh
Copy link
Member Author

@mcollina PTAL (docs missing, working on that now)

Copy link
Contributor

@vsemozhetbyt vsemozhetbyt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for many petty doc nits)

<a id="ERR_STREAM_PREMATURE_CLOSE"></a>
### ERR_STREAM_PREMATURE_CLOSE

An error returned by `stream.onEnd` and `stream.pipeline`, when a stream
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: `stream.onEnd` and `stream.pipeline` -> `stream.onEnd()` and `stream.pipeline()`?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be stream.finished() and not stream.onEnd(), since the former is what is currently used in this PR.

@@ -46,6 +46,8 @@ There are four fundamental stream types within Node.js:
* [Transform][] - Duplex streams that can modify or transform the data as it
is written and read (for example [`zlib.createDeflate()`][]).

Additionally this module includes the utility functions [pipeline][] and [finished][].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should be wrapped after 80 characters.

@@ -1263,6 +1265,106 @@ implementors should not override this method, but instead implement
[`readable._destroy`][readable-_destroy].
The default implementation of `_destroy` for `Transform` also emit `'close'`.

### stream.pipeline(...streams[, callback])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK to use rest parameters syntax with non-last parameter even just for convenience?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems stream.pipeline() section should go after stream.finished() section, alphabetically.

* `callback` {Function} A callback function that takes an optional error
argument.

A class method to pipe between streams forwarding errors and properly cleaning
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

class method -> module method?

const fs = require('fs');
const zlib = require('zlib');

// Use pipeline api to easily pipe a series of streams
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

api -> API?

console.log('Pipeline succeded');
}

run().catch(console.log);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.catch(console.log) -> .catch(console.error)?


finished(rs, (err) => {
if (err) {
console.log('Stream failed', err);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.log() -> console.error()?


Especially useful in error handling scenarios where a stream is destroyed
prematurely (like an aborted HTTP request), and will not emit `end`
or `finish`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, but maybe `end` or `finish` -> `'end'` or `'finish'` events?

prematurely (like an aborted HTTP request), and will not emit `end`
or `finish`.

The `finished` API is promisify'able as well
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as well -> as well:?

console.log('Stream is done reading');
}

run().catch(console.log);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.catch(console.log) -> .catch(console.error)?

@vsemozhetbyt vsemozhetbyt added the doc Issues and PRs related to the documentations. label Apr 6, 2018
@mafintosh
Copy link
Member Author

@vsemozhetbyt fixed your doc nits

@mafintosh
Copy link
Member Author

Rebased as #19836 landed, PTAL

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still LGTM

@jasnell jasnell added this to the 10.0.0 milestone Apr 9, 2018
@mafintosh
Copy link
Member Author

@BridgeAR BridgeAR mentioned this pull request Apr 10, 2018
@ronkorving
Copy link
Contributor

@mafintosh Sorry for bikeshedding. I'm not a native English speaker, but would like to ask... Can pipeline reasonably be used as a verb? If so, LGTM 👍 If not, I personally prefer a verby method, rather than a noun (which I read pipeline as). This could just be my lack of knowledge of the English language though.

@mafintosh
Copy link
Member Author

@ronkorving Not a native speaker as well, but I use pipeline as a verb and I think that's correct.

@mafintosh
Copy link
Member Author

@mscdex I plan on landing this Monday, so PTAL before if you have time :)

mafintosh added a commit that referenced this pull request Apr 16, 2018
PR-URL: #19828
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@mafintosh
Copy link
Member Author

Landed in f64bebf

@mafintosh mafintosh closed this Apr 16, 2018
@mafintosh mafintosh deleted the add-pump-and-eos branch April 16, 2018 14:05
jasnell pushed a commit that referenced this pull request Apr 16, 2018
PR-URL: #19828
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
BridgeAR pushed a commit to BridgeAR/node that referenced this pull request May 1, 2018
PR-URL: nodejs#19828
Reviewed-By: Matteo Collina <[email protected]>
Reviewed-By: James M Snell <[email protected]>
@Trott Trott mentioned this pull request Nov 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Issues and PRs related to the documentations. lib / src Issues and PRs related to general changes in the lib or src directory. semver-minor PRs that contain new features and should be released in the next minor version. stream Issues and PRs related to the stream subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants