Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection.end() causes an emit of ECONNRESET #300

Closed
OJezu opened this issue Jan 24, 2014 · 12 comments · Fixed by #444
Closed

Connection.end() causes an emit of ECONNRESET #300

OJezu opened this issue Jan 24, 2014 · 12 comments · Fixed by #444

Comments

@OJezu
Copy link

OJezu commented Jan 24, 2014

When I run some of the tests, e.g

node test/test-basic-return.js

I get

/home/ch/node_modules/amqp/node_modules/longjohn/dist/longjohn.js:184
        throw e;
              ^
Error: read ECONNRESET
    at errnoException (net.js:901:11)
    at onread (net.js:556:19)
---------------------------------------------
    at module.exports.run (/home/ch/node_modules/amqp/test/harness.js:49:23)
    at Object.<anonymous> (/home/ch/node_modules/amqp/test/test-basic-return.js:1:84)
    at Module._compile (module.js:456:26)
    at Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Module._load (module.js:312:12)
    at Module.runMain (module.js:497:10)
    at startup (node.js:119:16)

It seems it is connected to #295 and #246, however the code from #248 does not solve it.

It is a hell of a bug, because it caused me to look for a few hours for a bug in exchange.publish, and then it turned out it is some networking stuff happening upon connection.end().

@djhojd
Copy link

djhojd commented Jan 30, 2014

+1, same here

@theartoflogic
Copy link

I am getting the same error, however, it's limited to only the servers in my prod environment. My local environment, dev, and stage all work fine. However, it's 100% reproducible on the prod servers. The interesting thing is that they all have the same server setup (except my local env).

I have tried using the latest stable version of the node-amqp library (that does not include the explicit connectionCloseOk handler), as well as with the connectionCloseOk handler and both have the same issue.

After using node-inspector to step through the entire execution flow (very, very tediously) I still could not figure out exactly why it was throwing the ECONNRESET error. The connectionClose event to the rabbit server gets sent fine, and I get the connectionCloseOk response from the server. However, once it calls socket.end() after that it always throw this exception.

If I use netstat to see how many sockets are open to the rabbit server I can see that the socket has not been closed, so eventually rabbit starts blocking the connections as my node-pool continues to create new connections (thinking the old ones were closed properly).

I've been able to fix the ECONNRESET error by modifying the call to socket.end() to be socket.destroy() instead after it receives the connectionCloseOk response. With that change I no longer get the ECONNRESET error, and the socket is properly closed. However, I'm not sure if that's the proper way to close the socket, or if there may be other unintended consequences that arise from that change, but so far I haven't noticed any ill effects and it's been running in my prod environment for the past few weeks.

Here is some information on the environment where this issue happens:

OS: Debian Squeeze (v6.0.8)
Node: v0.10.5
RabbitMQ: v3.1.5
node-amqp: v0.1.8

@jgato
Copy link

jgato commented Jan 31, 2014

I have a similar issue using node v0.10.X but it disappears if I use node 0.8. I am not saying this is a solution, but...

@rafaelkaufmann
Copy link

+1 for this issue on CentOS 6.3, Node 0.10.22, RabbitMQ 3.2.3, node-amqp v0.1.8. Also, I did not get the error on my development machine, which is a Mac OS Mavericks installation with same configurations otherwise.

@rafaelkaufmann
Copy link

@theartoflogic: you should be aware that socket.destroy() does not actually end the connection. I tried to use @glenjamin 's disconnect branch, which features a connectionCloseOk handler, and replaced socket.end() with socket.destroy() there. This did shut up the exception, but a look at rabbitmqadmin list connections afterwards will show you that the connection still exists as far as the Rabbit server is concerned. These dangling connections can very well build up...

@theartoflogic
Copy link

@rafaelkaufmann the socket.destroy() call, for me at least, does seem to close the socket. I did the same thing...I used a branch with the connectionCloseOk handler, changed socket.end() to socket.destroy() and everything started working properly: there was no longer an exception, the socket did not show up when using netstat -an | grep 5672, and the RabbitMQ admin console showed that the connection was no longer open.

I'm not sure why it wouldn't work for you, but I'd chalk it up to the weirdness of this bug in general. It seems to vary under what circumstances it happens, and it seems that it varies under which conditions it's "fixed" as well. For now, it's working for me and that's the most important thing. I'm still hoping for an actual fix for it though. I just wanted to present my findings as far as what worked for me in case it can help anyone else in the short term.

@rafaelkaufmann
Copy link

That's truly bizarre @theartoflogic. Thanks for sharing. 👍

@OJezu
Copy link
Author

OJezu commented Jan 31, 2014

It seems that RabbitMQ responds to TCP FIN packet with RST packet:

 35 1.061573000    127.0.0.1 -> 127.0.0.1    TCP 66 47322 > amqp [FIN, ACK] Seq=356 Ack=512 Win=32768 Len=0 TSval=123271910 TSecr=123271909
 36 1.061735000    127.0.0.1 -> 127.0.0.1    TCP 66 amqp > 47322 [RST, ACK] Seq=512 Ack=357 Win=32768 Len=0 TSval=123271910 TSecr=123271910

It happens no matter if the node-amqp is from master, or from #248. It can either be a bug or working-as-intented on RabbitMQ side. I would file a bug upstream, but no idea where their bug tracker is (mailing list?)

@glenjamin
Copy link
Contributor

My branch doesn't stop other bits of the system from trying to send data once the socket is closed - I was a bit lazy there.

Say you have a continuous publish(), then you trigger a close - the publish that happens after socket.end() / connectionCloseOk will cause the exception described above.

Possible Solutions:

  1. Make sure your app stops trying to send data after a close
  2. Extend the branch to make node-amqp throw a more obvious error in this scenario
  3. Call socket.end() and socket.destroy() in the handler - this should stop data being sent after the close think.

@rafaelkaufmann
Copy link

@glenjamin, I wound up fleeing from the problem :) Meaning I refactored my app to be less dependent on the details of connection management...

@OJezu
Copy link
Author

OJezu commented Feb 3, 2014

@glenjamin it is not problem of app sending data after close. It seems that RabbitMQ responds (on TCP level) with RST to client's FIN. I did some tracing, and there is no packet sent to RabbitMQ after amqp's Close. RST may be intentional, i.e. RabbitMQ does not wish to do full for way close, but cuts the connection short with RST. If that is the case, then probably something should handle and quiet ECONNRESET after closeOk is received.

@pruthvireddypuresoftware

test/test-basic-return.js: Error: read ECONNRESET
at TCP.onread (net.js:660:25)

at Object.run (/node/node-amqp/test/harness.js:49:23)
at Object.<anonymous> (/node/node-amqp/test/test-basic-return.js:1:22)
at Module._compile (internal/modules/cjs/loader.js:689:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:700:10)
at Module.load (internal/modules/cjs/loader.js:599:32)
at tryModuleLoad (internal/modules/cjs/loader.js:538:12)
at Function.Module._load (internal/modules/cjs/loader.js:530:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:742:12)
at startup (internal/bootstrap/node.js:266:19)

Facing above issue in debian, nodejs v10.8 and RabbitMQ 3.3.5. Downloaded the current commit and connection.js file has destroy function called.
case methods.connectionCloseOk:
debug && debug("Received close-ok from server, closing socket");
this.socket.end();
this.socket.destroy();
break;

Please suggest me any other solution. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants