runtime: unexpected return pc for golang.gopark #47003
Comments
This looks like memory corruption. Have you tried running your program under the race detector? See https://blog.golang.org/race-detector.
Yes, I've tried the race detector and there are no complaints from the tests. Just to make sure, I merely need to run
The race detector can only detect races if your test code exercises them. The litmus test is to pass -race to go build and deploy the application.
@davecheney oh, thanks, that's something I missed. I will deploy the race-enabled version to see if we can catch anything, and report back.
Unfortunately ... enabling the race detector does not give me any more information about the issue.
Since you can reproduce the issue, can you provide complete instructions to let others do so?
@networkimprov sorry for the late reply. I cannot really reproduce it; I just keep it running in our staging environment, and the occurrences seem random to me. In addition to enabling the race detector in the build, I've enabled core dumps this time, and I hope to get more info to share here later. Just my feeling: it could be a runtime issue with memory management.
Luckily I caught a core dump this time, but the backtrace is a mystery to me as well.
The corresponding stderr looks like this:
Output from
Perhaps related: I got another stack trace from a user, as below:
It feels like, starting from Go 1.16, the new GC may accidentally corrupt some memory area of parked or running goroutines?
@imcom Anything is possible, but as far as I can tell you are the only person reporting this specific problem. That makes it seem more likely to be due to memory corruption in your program rather than in the garbage collector. Memory corruption could come, for example, from uses of unsafe, or calls to C code, or memory races. Unfortunately it's going to be hard for us to analyze this if we can't reproduce it. Is it possible for you to share the code with us, even if the problem doesn't happen very often?
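As an illustration of the "uses of unsafe" case mentioned above, here is a small sketch (not taken from GoBGP or the runtime) of how pointer arithmetic through unsafe.Pointer bypasses bounds checks and silently overwrites a neighbouring field. The record type and writeAt helper are hypothetical. In a real program the overwritten neighbour could be a pointer or a saved return address rather than a harmless integer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// record is a hypothetical struct; guard sits directly after buf in memory.
type record struct {
	buf   [4]byte
	guard uint32
}

// writeAt stores v at byte offset i from the start of buf using pointer
// arithmetic, skipping the bounds check that r.buf[i] = v would perform.
//
//go:noinline
func writeAt(r *record, i int, v byte) {
	p := (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(&r.buf[0])) + uintptr(i)))
	*p = v
}

func main() {
	var r record
	writeAt(&r, len(r.buf), 0xFF) // one byte past the end of buf
	fmt.Printf("guard corrupted: %v\n", r.guard != 0)
}
```

The program compiles and runs without any error or panic, which is why this class of bug tends to surface later as an unrelated-looking crash.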
Hi @ianlancetaylor! Glad to hear from you. Actually, I am not the only one who has experienced this; there are three people that I know of. But indeed, we are all working with the same program, GoBGP. I am pretty sure there is no cgo involved, but since the project uses reflection, unsafe is there. I tried a race-detector-enabled build, but it did not produce any meaningful dump. I am also writing a fuzzing tool on Side note on the people who experienced this: we are not in the same organisation, and our use cases and setups are different.
Was this fixed by 08ecdf7 (Go 1.18)? That commit links to #49686, which has a similar-looking stack trace: https://build.golang.org/log/c443a4442d00c324be6e09f70d5e3bc401493531
@tomfitzhenry, I don't think this was fixed by 08ecdf7. That commit only affects weak memory architectures, and this issue is being reported on amd64, which is not a weak memory architecture. Other than both being panics on the runtime stack, the two stack traces look pretty dissimilar to me.
Timed out in state WaitingForInfo. Closing. (I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Did not try
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I was running gobgpd:v2.23 in a production environment, and I have no clue how to troubleshoot or reproduce this issue.
This has only happened once in my production environment, and there were no applicable logs when the crash happened.
What did you expect to see?
I do not see why this unexpected return pc fatal error occurred. My main concern is that, since I do not know why it happened, I cannot tell when or whether it will happen again. I also need help interpreting the error dump and troubleshooting a runtime error like this. Thanks in advance.
What did you see instead?