Segfault using stddev on Ubuntu 15.10 #131

Closed
mkrb opened this issue Dec 15, 2015 · 6 comments

Comments

mkrb commented Dec 15, 2015

Hi,

I stressed the relay with graphite-stresser ('java -jar stresser.jar 127.0.0.1 2002 2000 1 1 false'), using today's GitHub sources.

Settings: 'relay -p 2002 -f /etc/carbon-c-relay.conf'

[2015-12-15 19:23:12] starting carbon-c-relay v1.2 (c006ac), pid=14077
configuration:
relay hostname = akker
listen port = 2002
workers = 4
send batch size = 2500
server queue size = 25000
statistics submission interval = 60s
server connection IO timeout = 600ms
routes configuration = /etc/carbon-c-relay.conf

parsed configuration follows:
cluster lokal
    any_of
        127.0.0.1:2003
    ;

aggregate ^STRESS.host.(.+).com.graphite.stresser.(.+).(.+)
    every 60 seconds
    expire after 65 seconds
    timestamp at start of bucket
    compute max write to
        stressagg.host.\1.\2.max
    compute min write to
        stressagg.host.\1.\2.min
    compute stddev write to
        stressagg.host.\1.\2.stdev
    compute average write to
        stressagg.host.\1.\2.avg
    send to lokal
    stop
    ;
match ^STRESS.
    send to blackhole
    stop
    ;
match *
    send to lokal
    ;

[2015-12-15 19:23:12] listening on tcp4 0.0.0.0 port 2002
[2015-12-15 19:23:12] listening on udp4 0.0.0.0 port 2002
[2015-12-15 19:23:12] listening on UNIX socket /tmp/.s.carbon-c-relay.2002
[2015-12-15 19:23:12] starting 4 workers
[2015-12-15 19:23:12] starting aggregator
[2015-12-15 19:23:12] starting statistics collector
[2015-12-15 19:23:12] startup sequence complete
*** Error in `carbon-c-relay': double free or corruption (fasttop): 0x00007f633c0186f0 ***
Aborted (core dumped) <-- segfault

Is this reproducible in other environments, too? Thank you!
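
For readers unfamiliar with the aggregate rule above, here is a rough Python sketch of what "every 60 seconds", "timestamp at start of bucket" and the four compute lines ask for. It is not the relay's code: the function names are made up for illustration, and using the population standard deviation is an assumption, since the thread does not say which variant the relay computes.

```python
from collections import defaultdict
from statistics import mean, pstdev

INTERVAL = 60  # "every 60 seconds" in the rule above


def bucket_start(ts, interval=INTERVAL):
    """Align a Unix timestamp to the start of its bucket (hypothetical helper)."""
    return ts - (ts % interval)


def aggregate(points, interval=INTERVAL):
    """Group (timestamp, value) pairs into buckets and compute the four aggregates.

    Returns a dict keyed by bucket start time, mirroring
    "timestamp at start of bucket".
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[bucket_start(ts, interval)].append(value)
    return {
        start: {
            "max": max(vals),
            "min": min(vals),
            "stddev": pstdev(vals),  # assumption: population stddev
            "avg": mean(vals),
        }
        for start, vals in sorted(buckets.items())
    }


if __name__ == "__main__":
    sample = [(120, 2.0), (130, 4.0), (179, 6.0), (180, 10.0)]
    for start, stats in aggregate(sample).items():
        print(start, stats)
```

Note that max, min and average could be kept as running values, whereas the standard deviation here is computed from the full list of samples in the bucket.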

grobian (Owner) commented Dec 16, 2015

What kind of metrics does your stresser produce?

grobian (Owner) commented Dec 16, 2015

Your subject suggests you know it's in stddev. Is that because it doesn't crash when stddev isn't used?

grobian (Owner) commented Dec 16, 2015

Commit 591a271 may be the cause of this, but I'm still digging to see if I can find something else.

grobian added a commit that referenced this issue Dec 16, 2015
grobian (Owner) commented Dec 16, 2015

Could you please try with v1.3?

mkrb (Author) commented Dec 16, 2015

The stresser tool is from here:
https://github.com/feangulo/graphite-stresser

With only average, max and min the relay was stable; adding stddev resulted in the segfaults. Further tests revealed that percentile90 and percentile95 led to the segfaults, too.

A short test with a fresh clone: the relay is stable and the segfault is gone, until I ^C the relay:

[2015-12-16 19:13:41] listening on tcp4 0.0.0.0 port 2002
[2015-12-16 19:13:41] listening on udp4 0.0.0.0 port 2002
[2015-12-16 19:13:41] listening on UNIX socket /tmp/.s.carbon-c-relay.2002
[2015-12-16 19:13:41] starting 4 workers
[2015-12-16 19:13:41] starting aggregator
[2015-12-16 19:13:41] starting statistics collector
[2015-12-16 19:13:41] startup sequence complete
^C[2015-12-16 19:22:20] caught SIGINT, terminating...
[2015-12-16 19:22:20] shutting down...
[2015-12-16 19:22:20] listeners for port 2002 closed
[2015-12-16 19:22:20] collector stopped
*** Error in `/root/carbon-c-relay/relay': free(): invalid pointer: 0x00007ff8c81f0530 ***
Aborted (core dumped)

or

^C[2015-12-16 19:27:53] caught SIGINT, terminating...
[2015-12-16 19:27:53] shutting down...
[2015-12-16 19:27:53] listeners for port 2002 closed
[2015-12-16 19:27:54] collector stopped
*** Error in `/root/carbon-c-relay/relay': double free or corruption (out): 0x00007f99181c2c10 ***

This happens in all tested combinations, e.g. with only avg, max and min, and also after adding stdev etc. to that basic set.

Thank you for the efforts (and the really performant relay)!

grobian (Owner) commented Dec 16, 2015

OK, variance, stddev, percentileX and mean all share the same "new" code to store the actual values, which is why they cause problems. I'll have a go with your stresser. It seems the problem has moved to invalid cleanup, so I believe I "improved" the situation considerably ;)
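
To illustrate the distinction drawn in this comment, the following is a minimal Python sketch, not the relay's actual C code. The class names, the `release` method and the nearest-rank percentile are all assumptions made for illustration. It shows why some aggregates can live on constant-size running state while others need every value in the bucket to be stored, which in turn means there is a per-bucket buffer that has to be cleaned up.

```python
import math


class RunningBucket:
    """Constant-size state: enough for min, max and average (hypothetical)."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.lo = math.inf
        self.hi = -math.inf

    def add(self, value):
        self.count += 1
        self.total += value
        self.lo = min(self.lo, value)
        self.hi = max(self.hi, value)

    def average(self):
        return self.total / self.count


class StoredBucket(RunningBucket):
    """Also keeps every value, which stddev and percentiles need here."""

    def __init__(self):
        super().__init__()
        self.values = []

    def add(self, value):
        super().add(value)
        self.values.append(value)

    def stddev(self):
        # Population standard deviation over the stored values (an assumption).
        avg = self.average()
        return math.sqrt(sum((v - avg) ** 2 for v in self.values) / self.count)

    def percentile(self, p):
        # Nearest-rank percentile; the relay's exact method is not shown in the thread.
        ordered = sorted(self.values)
        rank = max(1, math.ceil(p * len(ordered) / 100))
        return ordered[rank - 1]

    def release(self):
        """Drop the stored values; calling this more than once is harmless."""
        self.values = []


if __name__ == "__main__":
    bucket = StoredBucket()
    for v in [2, 4, 4, 4, 5, 5, 7, 9]:
        bucket.add(v)
    print(bucket.lo, bucket.hi, bucket.average(), bucket.stddev(), bucket.percentile(90))
    bucket.release()
```

In Python the repeated `release` call is harmless by construction. In C, an equivalent value buffer would have to be freed exactly once, which is the kind of cleanup the glibc messages above complain about.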

grobian added a commit that referenced this issue Dec 24, 2015
This is fixes a crash/invalid read on shutdown or config reload, as
noted in issues #126, #131 and #132.
grobian closed this as completed Dec 24, 2015
pkittenis pushed a commit to pkittenis/carbon-c-relay that referenced this issue Feb 3, 2016
pkittenis pushed a commit to pkittenis/carbon-c-relay that referenced this issue Feb 3, 2016
This is fixes a crash/invalid read on shutdown or config reload, as
noted in issues grobian#126, grobian#131 and grobian#132.