after HUP :: failed to write() to localhost:2033: Broken pipe #341

Closed
kolobaev opened this issue Aug 17, 2018 · 5 comments

kolobaev commented Aug 17, 2018

carbon-c-relay v3.3
metricsReceived: about 2 million per minute

Hello! I had a problem!
After I changed the configuration file (added one more match for blackhole) and sent the HUP signal to the carbon-c-relay PID, the relay paused for a few seconds and then started logging that it could not write to the backend (localhost:2033). After that, port 2003 was no longer available.

log:

configuration:
    relay hostname = relay03
    workers = 56
    send batch size = 20000
    server queue size = 10000000
    server max stalls = 4
    listen backlog = 32
    server connection IO timeout = 5000ms
    idle connections disconnect timeout = 10m
    extra allowed characters = +/=_-:#
    configuration = /etc/carbon-c-relay.conf
...

[2018-08-17 12:24:32] (MSG) caught SIGHUP
[2018-08-17 12:24:32] (MSG) closing logfile
[2018-08-17 12:24:32] (MSG) reopening logfile
[2018-08-17 12:24:32] (MSG) reloading config from '/etc/carbon-c-relay.conf'
--- /tmp/carbon-c-relay_route.4gQzXQ    2018-08-17 12:24:32.417354582 +0300
+++ /tmp/carbon-c-relay_route.IryQC4    2018-08-17 12:24:32.417354582 +0300
@@ -32,7 +32,6 @@

 match
         .+carastral_numbers_out_.*
-        munin.stats_api_functions
         queues.rp_
         site-functional-test-
         rp-201
[2018-08-17 12:24:32] (MSG) closed listener for tcp :2003
[2018-08-17 12:24:32] (MSG) closed listener for udp :2003
[2018-08-17 12:24:32] (MSG) reloading collector
[2018-08-17 12:24:32] (MSG) interrupting workers
[2018-08-17 12:24:32] (MSG) expiring aggregations
[2018-08-17 12:24:32] (MSG) listening on tcp4 0.0.0.0 port 2003
[2018-08-17 12:24:32] (MSG) listening on tcp6 :: port 2003
[2018-08-17 12:24:32] (MSG) listening on udp4 0.0.0.0 port 2003
[2018-08-17 12:24:32] (MSG) listening on udp6 :: port 2003
[2018-08-17 12:24:32] (MSG) reloading workers
[2018-08-17 12:24:33] (MSG) SIGHUP handler complete
[2018-08-17 12:24:54] (ERR) failed to write() to localhost:2033: Broken pipe
[2018-08-17 12:24:54] (ERR) server localhost:2033: OK
# nc -zv -w1 localhost 2003
nc: connect to localhost port 2003 (tcp) failed: Connection timed out
# service carbon-c-relay restart
Ok
# nc -zv localhost 2003
Connection to localhost 2003 port [tcp/cfinger] succeeded!
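
The `nc -zv -w1 localhost 2003` probe used above can also be scripted. The following is a minimal sketch and is not part of carbon-c-relay; the function name `port_is_open` and the 1-second default timeout are assumptions chosen to mirror the `nc` flags.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    Similar in spirit to `nc -z -w1 host port`: connect, then close at once.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling `port_is_open("localhost", 2003)` would then stand in for the manual `nc` check against the relay's listener.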

grobian (Owner) commented Aug 17, 2018

Hmm, not nice indeed

kolobaev (Author) commented Aug 20, 2018

Also:
We ran a test. Its results are visible on the chart; below is a detailed timeline of what happened to the system:

[Screenshot: Grafana dashboard for carbon-c-relay, 2018-08-20]

START
10:41:01 :: healthy.ch : nc -zv -w1 localhost 2003 : Connection to localhost 2003 port [tcp/cfinger] succeeded!
10:41:00 :: kill -HUP $pidRelay
10:41:01 :: carbon-c-relay began reloading
10:41:01 :: healthy.ch : connect to localhost port 2003 (tcp) timed out: Operation now in progress
10:41:02 :: BGP :: availability check failed - stopping the BGP session between relay03 and juniper
10:41:03 :: BGP :: BGP session stopped (-relay03). Rebalancing received metrics
10:41:04 :: carbon-c-relay SIGHUP handler complete
10:41:05 :: healthy.ch : Connection to localhost 2003 port [tcp/cfinger] succeeded!
10:42:00 :: BGP :: creating a BGP session between relay03 and juniper
10:42:01 :: BGP session created (+relay03). Rebalancing received metrics
10:43:00 :: relay is running, but metrics connections/disconnections == 0, metricSent == 0, metricReceived == 0, and port 2003 is not available (timeout)
10:45:30 :: service carbon-c-relay restart
10:46:00 :: carbon-c-relay status OK!
END
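
The test above (send HUP, then keep probing port 2003) can be sketched as a small script. This is only an illustration under stated assumptions: the helper names `watch_port`, `reload_and_watch` and `failed_probes` are hypothetical, `pid` is assumed to be the relay's process ID, and the 180-second default window is chosen because the timeline shows the port becoming unavailable again about two minutes after the reload.

```python
import os
import signal
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """TCP connect check, similar in spirit to `nc -z -w1 host port`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch_port(host, port, duration=10.0, interval=0.5):
    """Probe the port repeatedly; return a list of (elapsed_seconds, is_open)."""
    samples = []
    start = time.monotonic()
    while time.monotonic() - start < duration:
        samples.append((time.monotonic() - start, port_is_open(host, port)))
        time.sleep(interval)
    return samples


def reload_and_watch(pid, host="localhost", port=2003, duration=180.0):
    """Send SIGHUP to `pid` (assumed to be the relay), then watch the port."""
    os.kill(pid, signal.SIGHUP)
    return watch_port(host, port, duration=duration)


def failed_probes(samples):
    """Return the elapsed times at which the port was not reachable."""
    return [elapsed for elapsed, ok in samples if not ok]
```

Running `failed_probes(reload_and_watch(pid))` would list the offsets, in seconds after the HUP, at which the listener did not accept connections, which could be compared against the relay log timestamps.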

grobian (Owner) commented Sep 3, 2018

What is port 2203 in your scenario? Is that another service (e.g. outside the relay)?

grobian (Owner) commented Sep 3, 2018

Sorry, I meant port 2033.

grobian (Owner) commented Sep 3, 2018

With the diff that you show in your first comment, it shouldn't close the listener at all.

grobian closed this as completed in 616a58c on Sep 4, 2018