Crash: runtime: unexpected return pc for strconv.atof64 called from 0x... #2145
Comments
Please provide the entire config file if possible.
This might be related to the process receiving a SIGPIPE: golang/go#17393
@sparrc Here's the entire config file, with comments removed and sensitive information replaced with dummy information:
I'm a bit baffled by this; googling the first error message only turns up this exact bug report. The other part I don't understand is that the traceback indicates that strconv is being called in agent.go, but agent.go doesn't even import the strconv package. @robinsmidsrod can you confirm that the traceback always looks the same as the one you've provided? Can you provide any other information on these systems? Is there any possibility that memory corruption is happening?
@robinsmidsrod Any chance you could run version 1.1.1 with the race detector turned on? There is a Linux binary from our CI system available here: https://5535-33258973-gh.circle-artifacts.com/0/tmp/circle-artifacts.JpcxCqN/telegraf-race.gz
@sparrc I was also a bit baffled by the message, since strconv was not imported in the source file. Are there any performance penalties when running with the race detector enabled? These errors are happening in our production environment, so performance is crucial. There is a possibility that memory corruption could be going on. The VMs where this is happening are next gen Rackspace Cloud machines.
Yes, unfortunately there is a performance penalty when using the race detector.
@sparrc Sorry for the late response. What kind of performance penalty can I expect when running with the race detector on?
I'm not sure exactly, but it's substantial: at least a 75% performance hit.
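For readers unfamiliar with the race detector discussed above, here is a minimal Go sketch of the kind of problem it looks for. It is not Telegraf code: `safeCount` is a hypothetical helper written only for illustration. The mutex makes the shared counter safe; if the `mu.Lock()`/`mu.Unlock()` calls were removed, running the program with `go run -race` would be expected to report a data race on `counter`.

```go
package main

import (
	"fmt"
	"sync"
)

// safeCount is a hypothetical helper (not from Telegraf). It increments a
// shared counter from n goroutines. The mutex serializes the writes; without
// it, the concurrent counter++ would be a data race that -race can flag.
func safeCount(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	// Wait for every goroutine to finish before reading the result.
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(safeCount(100))
}
```

A binary built with `go build -race` carries the instrumentation that detects such unsynchronized accesses at runtime, which is where the performance cost mentioned above comes from.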
Pushing out the milestone, as the steps to reproduce this are unclear.
I haven't been able to use a binary with the race detector yet, but I've got some more stack traces to share with you. The next two come from the same machine, about 10 hours apart (10s measurement interval), and the last one is from another server with the same setup.
@robinsmidsrod Thanks for the new stack traces. Are these still with Telegraf 1.1?
@danielnelson Forgot to mention that these new ones are from version 1.3.5. We've diligently upgraded to the latest version in hopes of getting rid of these kinds of errors over time.
@robinsmidsrod Does this issue still occur with Telegraf 1.9 or newer?
@danielnelson Actually, we've been using 1.10.0 on our cluster of around 50 machines for some time now. I just checked the logs, and we haven't seen the message mentioned in the original bug report in at least a week, so it seems to be fixed. I think we can close this one. Most likely it was fixed as a side effect of some other change to the codebase. If it ends up coming back I can always reopen this issue.
Bug report
Telegraf crashes because of this error:
Relevant telegraf.conf:
I'm not sure if any part of the config file is relevant. Please advise if you want any part of it.
System info:
Telegraf v1.1.1 (git: release-1.1.0 94de9dc)
Ubuntu 14.04.5 LTS
Steps to reproduce:
Expected behavior:
Telegraf should not crash and continue running.
Actual behavior:
Telegraf crashes and has to be restarted by a process supervision tool.
Additional info:
telegraf_crash.txt