Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When an assert fires FGenericPlatformMisc::RaiseException is tracked instead than the actual error #514

Closed
Edstub207 opened this issue Mar 12, 2024 · 22 comments · Fixed by #537
Labels
Bug Something isn't working

Comments

@Edstub207
Copy link
Contributor

Edstub207 commented Mar 12, 2024

Environment

How do you use Sentry?
Sentry SaaS (sentry.io)

Which version of the SDK?
0.16.0

How did you install the package? (Git-URL, Assetstore)
Git Download

Which version of Unreal?
5.3.2

Is this happening in Unreal (editor) or on a player like Android, iOS, Windows?
Linux Server

Steps to Reproduce

  1. Run a server
  2. Observe that instead than the assert FGenericPlatformMisc::RaiseException is tracked as the issue instead

Expected Result

The assert is the captured event in Sentry

Actual Result

The default raise exception is captured and the assert is "lost"

Any logs or screenshots

Log cuts off at:
[2024.03.12-14.00.37:849][466]LogOutputDevice: Warning:

When just after that the following occurs on the actual session log:

[2024.03.12-15.48.57:139][448]LogSentrySdk: flushing session and queue before crashpad handler
[2024.03.12-15.48.57:139][448]LogSentrySdk: invoking `on_crash` hook
[2024.03.12-15.48.57:140][448]LogSentrySdk: sending envelope
[2024.03.12-15.48.57:140][448]LogSentrySdk: serializing envelope into buffer
[2024.03.12-15.48.57:140][448]LogSentrySdk: handing control over to crashpad
CommonUnixCrashHandler: Signal=11
[2024.03.12-15.48.57:300][448]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: invalid attempt to read memory at address 0x0000000000000008

[2024.03.12-15.48.57:300][448]LogCore: Fatal error!
@Edstub207 Edstub207 changed the title When an assert fires FGenericPlatformMisc::RaiseException is tracked instread than the actual assert When an assert fires FGenericPlatformMisc::RaiseException is tracked instread than the actual error Mar 12, 2024
@Edstub207 Edstub207 changed the title When an assert fires FGenericPlatformMisc::RaiseException is tracked instread than the actual error When an assert fires FGenericPlatformMisc::RaiseException is tracked instead than the actual error Mar 12, 2024
@tustanivsky
Copy link
Collaborator

Thank you, we'll check this on our end.

Btw is this issue something you've encountered just after upgrading to 0.16.0 or it was there previously as well?

@Edstub207
Copy link
Contributor Author

Hey @tustanivsky It's always been a thing as far as I know, however, we skipped from 0.8.0 to 0.16.0 so unsure if there is a regression in-between those versions that brought it back if this has been fixed before.

@bitsandfoxes
Copy link
Contributor

Just to get it right @Edstub207: You're asserting something and running into the thing that was supposed to be asserted right afterward? Are you able to provide us with a minimal repro?

@bitsandfoxes bitsandfoxes added the Bug Something isn't working label Apr 2, 2024
@Edstub207
Copy link
Contributor Author

@bitsandfoxes Yeah, when we hit an assert Sentry tracks FGenericPlatformMisc::RaiseException - Which is a generic bucket in Unreal and doesn't actually track the assert itself.

As a result we also don't have the full logs sent to Sentry either, because it's sending the logs at the time of FGenericPlatformMisc::RaiseException not whatever the assert actually is, which properly logs/outputs just after FGenericPlatformMisc::RaiseException.

@tustanivsky
Copy link
Collaborator

I think this is also related to #489 and #503. Basically, there's a whole bunch of situations when several top callstack frames of completely different events/errors could be the same which hides their origin and prevents them from being grouped properly at Sentry. I've been researching some alternative approaches of attaching only relevant callstack part to captured events lately and about to test those on my end to see whether it'll help to resolve the above issue.

@tustanivsky
Copy link
Collaborator

@Edstub207 I've created a draft PR which is supposed helping to resolve this issue. Currently fix works only for Win/Linux yet if the suggested approach appears to be viable porting it to other platforms should be pretty straightforward. The corresponding plugin package can be downloaded from our CI here so if you could give it a try sometime we'd really appreciate your feedback.

@imgamesgustav
Copy link

@tustanivsky I will give it a try as well. We are on windows :)

@Tobias-Nilsson-Sharkmob
Copy link

Tobias-Nilsson-Sharkmob commented Apr 16, 2024

I've tried this on our setup,
For asserts and other crashes, this seems to work well.
However, ensures on Linux servers seem to no longer generate any new issues in Sentry.
I've not done too much digging into this, but it seems to work well on Windows clients & seems to display slightly more info on Sentry issues. (The issue name is now "assertion failed", the Function name is rendered next to the issue name)

@tustanivsky Has the behavior been verified on Linux on your end? If so, this might be a setup issue on our end.

@tustanivsky
Copy link
Collaborator

Thanks for the heads up @Tobias-Nilsson-Sharkmob, this seemed to work as expected on Linux but I'll double-check on my side and let you know.

Just wanted to clarify a few things here:

  • Is this behavior specific to Linux Server only?
  • For this configuration asserts/crashes events are captured as usual and it's only the ensures not getting uploaded to Sentry?

@Tobias-Nilsson-Sharkmob
Copy link

Tobias-Nilsson-Sharkmob commented Apr 18, 2024

Heya!

I've dug a bit more, and it seems that ensures do not get sent from Windows or Linux servers on our end.

For crashes/asserts they do get captured properly, sometimes they arrive to Sentry and get labeled < unlabeled event >, which is a new behavior, but they do arrive to Sentry.

What did your testing result in? Are you seeing similar behaviors?
If not, we must have some change that I am not aware of that conflicts with this...

@tustanivsky
Copy link
Collaborator

tustanivsky commented Apr 18, 2024

Hey @Tobias-Nilsson-Sharkmob, I've managed to reproduce this issue with ensures not being uploaded to Sentry too. It looks that UE handles them separately from asserts/crashes bypassing our events-capturing mechanism. However there's still seems to be a way to hook in that process so that ensures can be intercepted and reported properly. I'll experiment with it a bit more and let you know.

@tustanivsky
Copy link
Collaborator

@Tobias-Nilsson-Sharkmob You can try the updated plugin package which contains a fix for that issue with ensures mentioned above.

@Tobias-Nilsson-Sharkmob

Hi!

I've verified that ensures works properly now!
It does however seem the SentrySubsystemDesktop::CaptureEnsure culls one too many frames, I've changed this from 8 to 7 and get the behavior I would expect.

Any reason for ending 1 frame before the function with the ensure?

Thanks for getting this fixed quickly!

@tustanivsky
Copy link
Collaborator

@Tobias-Nilsson-Sharkmob Thank you for the timely feedback! Which Unreal version and platform you're on? I suppose for older UE versions that generic callstack part which we're cutting off could differ so that's something we have to consider separately. I tested this only with UE 5.3 Win/Linux and so far everything seemed to work as expected.

@Tobias-Nilsson-Sharkmob

Hey!
Tested on Windows & Linux for Unreal 5.1.
This might explain the difference, might be good with a code switch depending on the engine version in that case for the final push.

I'll get this running on our Main and see if any issues pop up.

Again, thanks for the help!

@tustanivsky
Copy link
Collaborator

Got it, I'll make some extra tests with different UE versions to see how we can handle this.

@Tobias-Nilsson-Sharkmob
Copy link

Tobias-Nilsson-Sharkmob commented Apr 25, 2024

Hello again!

We've been running this for a while now, and these changes seem to have exposed some underlying issues-
Since ensures can happen from different threads, FSentryOutputDevice::Serialize may create Breadcrumb UObjects via AddBreadcrumbWithParams.
This breaks since there is a chance that the game thread is running garbage collection when this happens triggering "Unable to create new object: %s %s.%s. Creating UObjects while Collecting Garbage is not allowed"
This may also break in SentryScopeDesktop::AddBreadcrumb, since two threads can edit the internal list at the same time.

For our usage, we may simply switch USentryBreadcrumb from an UObject to a UStruct since we have no need for the UFunctions on that type, and add a critical section around the internal list to solve this for now. However, this is likely a larger issue that needs to be considered before these changes are merged.

@Edstub207
Copy link
Contributor Author

I'll add our testing as well, as while Tobias + I are from the same studio, we're on different projects.

We rolled out this change but had to back it out due to it causing a crash which wasn't fatal without the changes - Potentially related to the Multhreaded stuff Tobias mentioned. The crash is in our code, pending a fix, but we intend on taking the changes back in once that's resolved. The brief time we had it available there where no obvious issues, although I would most likely wait for a resolution on the issue raised by Tobias before rolling out.

@tustanivsky
Copy link
Collaborator

@Tobias-Nilsson-Sharkmob @Edstub207 Thanks for the heads up and providing the above details! I'll look into that and try to add some threading considerations

@Edstub207
Copy link
Contributor Author

Hey @tustanivsky any update on this?

@tustanivsky
Copy link
Collaborator

Hey @Edstub207, we will revisit this sometime next week and let you know once there are any updates. Tobias made some great points above regarding how threading issues may be resolved so I'll check if we can adopt these to make things moving faster.
Thank you for your patience.

@tustanivsky
Copy link
Collaborator

@Edstub207 @Tobias-Nilsson-Sharkmob we're looking forward to merging #559 which is supposed to fix the threading issues mentioned earlier. Also, the suggested solution for assertion/ensure events grouping problem now supports Android/Apple and should be available after #537 passes the review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
Archived in project
Archived in project
Development

Successfully merging a pull request may close this issue.

5 participants