_stream_writable.js - doWrite() runs in an endless loop #1992

cookiengineer · 2015-06-16T18:25:42Z

Sometimes (on weird clients connecting) the same error happens again. The call stack size exceeds in _stream_writeable.js' doWrite() method.

Why is this the case?
What different scenarios could trigger this?
How to debug / how to fix it?

_stream_writable.js:287
function doWrite(stream, state, writev, len, chunk, encoding, cb) {
                ^
RangeError: Maximum call stack size exceeded
    at doWrite (_stream_writable.js:287:17)
    at writeOrBuffer (_stream_writable.js:282:5)
    at Socket.Writable.write (_stream_writable.js:210:11)
    at Socket.write (net.js:601:40)

Update / TL;DR

net.Socket.write() might trigger an error. If the protocol implementation still tries to send a close code (e.g. by WebSockets spec), this leads to an endless loop.

The text was updated successfully, but these errors were encountered:

mscdex · 2015-06-16T19:05:33Z

What is your code doing? HTTP? Plain TCP? Something else?

cookiengineer · 2015-06-16T19:22:19Z

It's implementing raw WebSockets via net.Socket and I think this bug is related to the close/end event that is triggered on disconnect.

I'm unsure about it, because it only happens when some blackhat scripts are trying to DDoS our servers just before. So I guess the events might be fired in false order due to their falsy raw socket connections or similar.

The close() method in the implementation is calling socket.write() with the proper websocket close code and then has as in the callback the socket.end() and socket.destroy() call.

So the close() method looks something like this (reduced):

lychee.net.protocol.WS.prototype.close = function(status) {
    // (...) prepare buffer with String.fromCharCode() stuff

    this.socket.write(buffer, function(err) {
        this.end();
        this.destroy();
    }.bind(this.socket));
};

I'm unsure if I'm using the events API correctly as there's no up2date docs available that show how the event flow is actually like :-/ Here's what the event bindings look like, before filing the issue, the "end" event was also bound. I'm chiming in back here if the error still occurs and is related to another event (or if I can trace down my falsy API usage).

        this.socket.on('error', function() {
            that.close(Class.STATUS.protocol_error);
        });

        this.socket.on('timeout', function() {
            that.close(Class.STATUS.going_away);
        });

        // this.socket.on('end', function() {
        //  that.close(Class.STATUS.normal_closure);
        // });

        this.socket.on('close', function() {
            that.close(Class.STATUS.normal_closure);
        });

trevnorris · 2015-06-16T21:01:43Z

See how cb is passed to doWrite()? That callback is called synchronously from Socket#_writeGeneric() if the write queue can be flushed immediately by the kernel (via uv_try_write()). So if you're able to quickly process many small packets it's possible to catch this in a synchronous loop.

I believe this is a related issue: nodejs/node-v0.x-archive#6623

cookiengineer · 2015-06-17T22:38:16Z

I think I traced the bug.

Does net.Socket.write() trigger an error event? That would cause an endless loop in my implementation, because on error, the implementation still tries to close it properly according to the WebSocket spec.

trevnorris · 2015-06-17T22:42:58Z

Yes. There are two places it can error. First is if you're attempting to write to a closed socket, the other is if the actual write failed (i.e. uv_write()).

sam-github · 2015-06-18T22:08:40Z

I'd expect an error to be unrecoverable, trying to write a ws spec error packet to tell the other side you got a write failure when writing to the other side isn't going to work. socket.destroy() is probably the right choice on error.

saghul · 2015-06-24T15:28:30Z

the other is if the actual write failed (i.e. uv_write()).

FTR, if the actual write(2) fails inside uv_write the error is deferred to the callback. From Node/iojs prespective, any error returned by uv_write is an internal error and you may as well abort(). (Note that when multiple (> 4) buffers are passed this may fail with ENOMEM as well)

trevnorris · 2015-06-24T16:48:04Z

@saghul Thanks for the information. So basically the extra logic here: https://github.com/nodejs/io.js/blob/v2.3.1/lib/net.js#L668-L672 is totally pointless. About ENOMEM, does that depend on the number of buffers only, or the total length of buffers?

@bnoordhuis So instead should we just FatalError() from StreamWrap::DoWrite() if an error code is returned? An exception might be the ENOMEM case.

bnoordhuis · 2015-06-24T17:25:06Z

There's also EBADF. If you get that error it's almost certainly an application bug (something closed the file descriptor from underneath the handle) but a hard abort isn't the nicest way to report that.

trevnorris · 2015-06-24T18:01:30Z

@bnoordhuis Think it would be reasonable for us to filter the error codes, return those that are application bugs and abort on internal errors?

saghul · 2015-06-24T18:55:35Z

About ENOMEM, does that depend on the number of buffers only, or the total length of buffers?

No, we malloc and copy the structures, not the buffer content.

There's also EBADF. If you get that error it's almost certainly an application bug (something closed the file descriptor from underneath the handle) but a hard abort isn't the nicest way to report that.

True that. Not sure what the policy is in Node, but I guess this should never reach the user, it kind of implies an internal error in Node, IMHO.

bnoordhuis · 2015-06-24T20:15:07Z

Not sure what the policy is in Node, but I guess this should never reach the user, it kind of implies an internal error in Node, IMHO.

It could be a bug in the user's code: fs.close(wrongFd)

saghul · 2015-06-24T20:28:18Z

Doh, indeed. Sorry for the noise then.

trevnorris · 2015-06-24T21:12:22Z

Just to verify, then there aren't any errors we'd want to abort on?

Fishrock123 · 2015-07-23T23:45:14Z

So... is this a bug, or?

cookiengineer · 2015-07-30T11:02:05Z

From my side, it's still unclear what I should do in order to support a correct WebSocket behaviour. As @sam-github mentioned, the destroy event should be the one to use for a websocket close code. For me it's unclear how to detect if the socket was closed properly from the foreign side or if the server closes the socket because of a real spec-defined error.

Is there an XOR condition between socket destroy and socket end?

Or can the end event lead to a destroy event (or vice versa)?
If so, that might not be wanted from the implementor's side for most TCP based protocols as pretty much all have some kind of status codes for socket conditions and it wouldn't make sense to have an error in the web browser if the user just reloaded the page and the socket got destructed.

jasnell · 2016-03-09T03:39:38Z

@trevnorris @saghul ... is there reason to keep this one open?

jasnell · 2016-04-02T07:06:23Z

Closing given the lack of further activity.

cookiengineer · 2016-04-02T07:17:58Z

Really? I don't know, but I think this is a security issue as it can crash a node server implementation, given that the attacker sends the right tcp events in the right order. If this isn't an issue, then at least provide better docs about which event can potentially fire which kind of error event. Otherwise pretty any implementation that handles TCP close frames will be broken.

jasnell · 2016-04-02T07:21:35Z

If someone can confirm that this a bug that needs fixing (@saghul @trevnorris @bnoordhuis ?) we can definitely reopen. Just closed because there's been no further discussion or investigation on it since last July.

jasnell · 2016-04-02T07:21:55Z

Can you verify whether this is still an issue in master?

brendanashworth added the stream Issues and PRs related to the stream subsystem. label Jun 16, 2015

jasnell closed this as completed Apr 2, 2016

This was referenced Apr 24, 2023

[Snyk] Fix for 13 vulnerabilities aliscco/alisco-node#112

Open

[Snyk] Fix for 5 vulnerabilities aliscco/alisco-node#348

Open

aliscco mentioned this issue Jun 24, 2023

[Snyk] Fix for 1 vulnerabilities aliscco/alisco-node#574

Open

This was referenced Jul 7, 2023

[Snyk] Fix for 4 vulnerabilities leonardoadame/node#63

Open

[Snyk] Fix for 9 vulnerabilities leonardoadame/node#128

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

_stream_writable.js - doWrite() runs in an endless loop #1992

_stream_writable.js - doWrite() runs in an endless loop #1992

cookiengineer commented Jun 16, 2015

mscdex commented Jun 16, 2015

cookiengineer commented Jun 16, 2015

trevnorris commented Jun 16, 2015

cookiengineer commented Jun 17, 2015

trevnorris commented Jun 17, 2015

sam-github commented Jun 18, 2015

saghul commented Jun 24, 2015

trevnorris commented Jun 24, 2015

bnoordhuis commented Jun 24, 2015

trevnorris commented Jun 24, 2015

saghul commented Jun 24, 2015

bnoordhuis commented Jun 24, 2015

saghul commented Jun 24, 2015

trevnorris commented Jun 24, 2015

Fishrock123 commented Jul 23, 2015

cookiengineer commented Jul 30, 2015

jasnell commented Mar 9, 2016

jasnell commented Apr 2, 2016

cookiengineer commented Apr 2, 2016

jasnell commented Apr 2, 2016

jasnell commented Apr 2, 2016

_stream_writable.js - doWrite() runs in an endless loop #1992

_stream_writable.js - doWrite() runs in an endless loop #1992

Comments

cookiengineer commented Jun 16, 2015

mscdex commented Jun 16, 2015

cookiengineer commented Jun 16, 2015

trevnorris commented Jun 16, 2015

cookiengineer commented Jun 17, 2015

trevnorris commented Jun 17, 2015

sam-github commented Jun 18, 2015

saghul commented Jun 24, 2015

trevnorris commented Jun 24, 2015

bnoordhuis commented Jun 24, 2015

trevnorris commented Jun 24, 2015

saghul commented Jun 24, 2015

bnoordhuis commented Jun 24, 2015

saghul commented Jun 24, 2015

trevnorris commented Jun 24, 2015

Fishrock123 commented Jul 23, 2015

cookiengineer commented Jul 30, 2015

jasnell commented Mar 9, 2016

jasnell commented Apr 2, 2016

cookiengineer commented Apr 2, 2016

jasnell commented Apr 2, 2016

jasnell commented Apr 2, 2016