Unreal Linux server fails to launch Unhandled Exception: SIGSEGV #259

Closed
damjess opened this issue Apr 19, 2023 · 13 comments · Fixed by #297
Labels
Bug Something isn't working

Comments

@damjess

damjess commented Apr 19, 2023

Environment

How do you use Sentry?
Sentry SaaS (sentry.io)

Which version of the SDK?
0.4.0

How did you install the package? (Git-URL, Assetstore)
Git-URL

Which version of Unreal?
5.1.1

Is this happening in Unreal (editor) or on a player like Android, iOS, Windows?
Linux Development Server

Steps to Reproduce

  1. Package Linux Development Server via Development Editor GUI
  2. Launch server in Linux (WSL2 or Hyper-V Ubuntu LTS 22.04) ( .\GameServer.sh)

Expected Result

Server launches without critical error

log:

[2023.04.13-23.59.24:376][  0]LogSentrySdk: using database path "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/.sentry-native"
[2023.04.13-23.59.24:379][  0]LogSentrySdk: starting transport
[2023.04.13-23.59.24:379][  0]LogSentrySdk: starting background worker thread
[2023.04.13-23.59.24:379][  0]LogSentrySdk: starting backend
[2023.04.13-23.59.24:382][  0]LogSentrySdk: background worker thread started
[2023.04.13-23.59.24:382][  0]LogSentrySdk: starting crashpad backend with handler "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/Plugins/Sentry/Binaries/Linux/crashpad_handler"
[2023.04.13-23.59.24:392][  0]LogSentrySdk: using minidump url "https://XXXXXX.ingest.sentry.io:443/api/XXXXXX"
[2023.04.13-23.59.24:402][  0]LogSentrySdk: started crashpad client handler
[2023.04.13-23.59.24:474][  0]LogSentrySdk: processing and pruning old runs
[2023.04.13-23.59.24:482][  0]LogSentrySdk: sending envelope
[2023.04.13-23.59.24:482][  0]LogSentrySdk: submitting task to background worker thread
[2023.04.13-23.59.24:638][  0]LogSentrySdk: executing task on worker thread
[2023.04.13-23.59.24:638][  0]LogSentrySdk: Sentry initialization completed with result 0 (0 on success).

Actual Result

Server crashes with Unhandled Exception: SIGSEGV: unaligned memory access (SIMD vectors?)

log:

[2023.04.13-04.19.36:580][  0]LogSentrySdk: using database path "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/.sentry-native"
[2023.04.13-04.19.36:580][  0]LogSentrySdk: starting transport
[2023.04.13-04.19.36:581][  0]LogSentrySdk: starting background worker thread
[2023.04.13-04.19.36:582][  0]LogSentrySdk: starting backend
[2023.04.13-04.19.36:582][  0]LogSentrySdk: background worker thread started
[2023.04.13-04.19.36:582][  0]LogSentrySdk: starting crashpad backend with handler "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/Plugins/Sentry/Binaries/Linux/crashpad_handler"
[2023.04.13-04.19.36:583][  0]LogSentrySdk: using minidump url "https://XXXXX.ingest.se

(^ this is the actual output, final line is truncated)

shell output:

[2023.04.14-00.50.08:987][  0]LogSentrySdk: using database path "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/.sentry-native"
[2023.04.14-00.50.08:989][  0]LogSentrySdk: starting transport
[2023.04.14-00.50.08:990][  0]LogSentrySdk: starting background worker thread
[2023.04.14-00.50.08:990][  0]LogSentrySdk: starting backend
[2023.04.14-00.50.08:991][  0]LogSentrySdk: background worker thread started
[2023.04.14-00.50.08:991][  0]LogSentrySdk: starting crashpad backend with handler "/path/to/project/Saved/StagedBuilds/LinuxServer/Project/Plugins/Sentry/Binaries/Linux/crashpad_handler"
[2023.04.14-00.50.09:000][  0]LogSentrySdk: using minidump url "https://XXXXX.ingest.sentry.io:443/api/XXXXXX"
[2023.04.14-00.50.09:004][  0]LogSentrySdk: started crashpad client handler
[27:27:20230414,105009.060229:ERROR file_io_posix.cc:152] open /path/to/project/Saved/StagedBuilds/LinuxServer/Project/.sentry-native/completed/7fed5db9-1d92-4486-8782-e13d125bc0a6.lock: File exists (17)
[2023.04.14-00.50.09:101][  0]LogSentrySdk: processing and pruning old runs
[2023.04.14-00.50.09:108][  0]LogSentrySdk: sending envelope
[2023.04.14-00.50.09:108][  0]LogSentrySdk: submitting task to background worker thread
*   Trying XX.XXX.XXX.XXX:443...
* Connected to XXXXXX.ingest.sentry.io (XX.XXX.XXX.XXX) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
[62:62:20230414,105009.134050:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq: No such file or directory (2)
[62:62:20230414,105009.134087:ERROR file_io_posix.cc:144] open /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq: No such file or directory (2)
Signal 11 caught.
Malloc Size=262146 LargeMemoryPoolOffset=262162
CommonUnixCrashHandler: Signal=11
[2023.04.14-00.50.09:227][  0]LogSentrySdk: executing task on worker thread
[2023.04.14-00.50.09:227][  0]LogSentrySdk: flushing session and queue before crashpad handler
[2023.04.14-00.50.09:227][  0]LogSentrySdk: dumped 1 in-flight envelopes to disk
[2023.04.14-00.50.09:227][  0]LogSentrySdk: handing control over to crashpad
[2023.04.13-04.46.32:529][  0]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: unaligned memory access (SIMD vectors?)

[2023.04.13-04.46.32:529][  0]LogCore: Fatal error!

0x00007feffb407218 libcrypto.so.3!OPENSSL_sk_value(+0x18)
0x00007feffb42977a libcrypto.so.3!X509_STORE_add_lookup(+0x29)
0x00007feffb42fcbf libcrypto.so.3!X509_STORE_load_file_ex(+0x2e)
0x00007feffba708c5 libcurl.so.4!UnknownFunction(0x728c4)
0x00007feffba76427 libcurl.so.4!UnknownFunction(0x78426)
0x00007feffba72d1b libcurl.so.4!UnknownFunction(0x74d1a)
0x00007feffba2b086 libcurl.so.4!UnknownFunction(0x2d085)
0x00007feffba44f26 libcurl.so.4!UnknownFunction(0x46f25)
0x00007feffba4741e libcurl.so.4!curl_multi_perform(+0xcd)
0x00007feffba23e53 libcurl.so.4!curl_easy_perform(+0x1a2)
0x00007feffbe0a17d libsentry.so!UnknownFunction(0x2f17c)
0x00007feffbdff312 libsentry.so!UnknownFunction(0x24311)
0x00007feffbb59b43 libc.so.6!UnknownFunction(0x94b42)
0x00007feffbbeba00 libc.so.6!UnknownFunction(0x1269ff)

[2023.04.13-04.46.32:541][  0]LogExit: Executing StaticShutdownAfterError
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)

We've tested that the same server launches without error on Ubuntu 20.04 (WSL2, Hyper-V VM, and AWS EC2), so the issue appears to be specific to Ubuntu 22.04.

edit:
Here is some behavior we see on the 20.04 production server that leads to a similar crash after an assertion failure:

[2023.04.19-23.33.07:782][767]LogOutputDevice: Warning: 

Script Stack (0 frames) :

[2023.04.19-23.33.07:805][767]LogCore: Error: appError called: Assertion failed: RegisteredLevels.Contains(Level) [File:.\../../Project/Plugins/CustomPlugin/Source/CustomPlugin/Private/Systems/CustomSubsystem.cpp] [Line: 122] 


[2023.04.19-23.33.07:805][767]LogSentrySdk: flushing session and queue before crashpad handler
[2023.04.19-23.33.07:805][767]LogSentrySdk: sending envelope
[2023.04.19-23.33.07:805][767]LogSentrySdk: serializing envelope into buffer
[2023.04.19-23.33.07:805][767]LogSentrySdk: handing control over to crashpad
[2023.04.19-23.33.07:817][767]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: invalid attempt to write memory at address 0x0000000000000003

[2023.04.19-23.33.07:817][767]LogCore: Assertion failed: RegisteredLevels.Contains(Level) [File:.\../../Project/Plugins/CustomPlugin/Source/CustomPlugin/Private/Systems/CustomSubsystem.cpp] [Line: 122] 

0x0000000005bb3464 LinuxServer!UnknownFunction(0x59b3463)
0x0000000005bb8794 LinuxServer!UnknownFunction(0x59b8793)
0x000000000a3e1817 LinuxServer!UnknownFunction(0xa1e1816)
0x000000000aa0a2c0 LinuxServer!UnknownFunction(0xa80a2bf)
0x0000000004096b29 LinuxServer!UnknownFunction(0x3e96b28)
0x0000000004096fcd LinuxServer!UnknownFunction(0x3e96fcc)
0x000000000bdd4b38 LinuxServer!UnknownFunction(0xbbd4b37)
0x00007fd6c0e39083 libc.so.6!__libc_start_main(+0xf2)
0x000000000408c029 LinuxServer!UnknownFunction(0x3e8c028)



0x0000000005bb3464 LinuxServer!UnknownFunction(0x59b3463)
0x0000000005bb8794 LinuxServer!UnknownFunction(0x59b8793)
0x000000000a3e1817 LinuxServer!UnknownFunction(0xa1e1816)
0x000000000aa0a2c0 LinuxServer!UnknownFunction(0xa80a2bf)
0x0000000004096b29 LinuxServer!UnknownFunction(0x3e96b28)
0x0000000004096fcd LinuxServer!UnknownFunction(0x3e96fcc)
0x000000000bdd4b38 LinuxServer!UnknownFunction(0xbbd4b37)
0x00007fd6c0e39083 libc.so.6!__libc_start_main(+0xf2)
0x000000000408c029 LinuxServer!UnknownFunction(0x3e8c028) 
damjess added the Bug (Something isn't working) and Platform: Unreal labels on Apr 19, 2023
@damjess
Author

damjess commented Apr 26, 2023

Can I provide any additional information? Is anyone available to look at this? We have now fully disabled Sentry in our project because it is too unstable.

@tustanivsky
Collaborator

Maybe @Swatinem has some ideas?

There was a similar issue a while ago (#236), though I'm not sure whether it's relevant.

@Swatinem
Member

The first log indeed looks like it's related to curl/OpenSSL.

Though the log line LogSentrySdk: handing control over to crashpad means that we should get an event into Sentry, as crashpad should send that independently of the SDK.

We shouldn't have any unaligned loads in Sentry itself, though one can never know with OpenSSL. Maybe @supervacuus has some ideas?

@supervacuus

First impression: this reminds me of a recent comment on an older issue: getsentry/sentry-native#337 (comment)

Is it possible that through the sentry-unreal build process (or the Unreal execution environment, of which I know very little) we somehow load incompatible OpenSSL versions?

I know that sentry-unreal is built using the toolchain from Ubuntu 20.04 to honor the Unreal ABI requirements, and that between 20.04 and 22.04 Ubuntu switched to OpenSSL 3.0 for the libcurl4 package (OpenSSL 1.1.1 and 3.0 are binary-incompatible). Still, the Native SDK doesn't depend on any specific SSL implementation anymore, so it must be a build-related issue.

I can create a repro tomorrow.

@supervacuus

I tried to reproduce this with clean builds of the Native SDK on EC2 instances (built on Ubuntu 20.04, repro run on 22.04) but couldn't. Neither the connections in the libsentry.so transport nor those in the crashpad_handler crashed due to any wrongly loaded libraries.

Following the steps in build-linux.sh and the GHA workflow, we can exclude the toolchain or the dev packages available on Ubuntu 20.04 as a root cause for a crash on 22.04.

I am unaware of any further build steps in sentry-unreal that would modify the linker/loader settings in the distributed binaries, or (more realistically) of any environmental influence on the loader (LD_LIBRARY_PATH, etc.) when things are executed. For instance, is the Unreal docker container also part of the distribution, or is it only used for testing?

@supervacuus

Another thing that caught my attention: since the log of the crashing boot stops at (or after) the Application-Layer Protocol Negotiation and the crash is in X509_STORE_load_file_ex(), maybe there is a problem with the certificate store.

@damjess
Author

damjess commented Apr 28, 2023

In terms of repro, in case it helps: I managed to run through this today:

  1. Removed the Sentry plugin entirely
  2. Re-added it to the project directory
  3. Rebuilt the engine (we are using a custom 5.1.1)
  4. Ran the Development Editor
  5. Built the Linux Development Server target
  6. Started WSL Ubuntu 22.04
  7. Launched the server
  8. Server launched successfully
  9. Ended the server process
  10. Launched the server again
  11. Segmentation fault
[2023.04.28-05.09.47:828][  0]LogSentrySdk: handing control over to crashpad
[2023.04.28-05.09.47:831][  0]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: unaligned memory access (SIMD vectors?)

[2023.04.28-05.09.47:831][  0]LogCore: Fatal error!

0x00007fe419ee0218 libcrypto.so.3!OPENSSL_sk_value(+0x18)
0x00007fe419f0277a libcrypto.so.3!X509_STORE_add_lookup(+0x29)
0x00007fe419f08cbf libcrypto.so.3!X509_STORE_load_file_ex(+0x2e)
0x00007fe41a31f8c5 libcurl.so.4!UnknownFunction(0x728c4)
0x00007fe41a325427 libcurl.so.4!UnknownFunction(0x78426)
0x00007fe41a321d1b libcurl.so.4!UnknownFunction(0x74d1a)
0x00007fe41a2da086 libcurl.so.4!UnknownFunction(0x2d085)
0x00007fe41a2f3f26 libcurl.so.4!UnknownFunction(0x46f25)
0x00007fe41a2f641e libcurl.so.4!curl_multi_perform(+0xcd)
0x00007fe41a2d2e53 libcurl.so.4!curl_easy_perform(+0x1a2)
0x00007fe41a8e317d libsentry.so!UnknownFunction(0x2f17c)
0x00007fe41a8d8312 libsentry.so!UnknownFunction(0x24311)
0x00007fe41a634b43 libc.so.6!UnknownFunction(0x94b42)
0x00007fe41a6c6a00 libc.so.6!UnknownFunction(0x1269ff)

[2023.04.28-05.09.47:843][  0]LogExit: Executing StaticShutdownAfterError
Engine crash handling finished; re-raising signal 11 for the default handler. Good bye.
Segmentation fault (core dumped)

@supervacuus

@damjess, I just wanted to clarify that my comments are related to the Native SDK, which Sentry's Unreal SDK depends on. I joined the discussion to help determine if there might be an issue with the Native SDK or its integration with the Unreal SDK releases.

It seems that an incompatible OpenSSL version might be loaded during the startup of the Unreal Server environment. However, I cannot confirm this or provide a solution since I cannot reproduce a context similar to yours. I am happy to assist with debugging any low-level problems and suggesting potential fixes. However, to fully understand the requirements of the Unreal deployment scenario, I would need the expertise of the sentry-unreal maintainers.

@tustanivsky
Collaborator

Another thing that caught my attention: since the log of the crashing boot stops at (or after) the Application-Layer Protocol Negotiation and the crash is in X509_STORE_load_file_ex(), maybe there is a problem with the certificate store.

That's a good point. I recall we faced problems with CA certs in that regard a while ago (#123), so maybe it's something worth checking, @damjess.
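For illustration, a minimal sketch at the Native SDK level of pointing it at an explicit CA bundle. The bundle path and the init function below are assumptions for this example only; sentry-unreal normally configures these options internally, so treat this purely as a way to test the certificate-store hypothesis, not as the plugin's API.

#include <sentry.h>

// Illustrative only: sentry-unreal normally sets up these options itself.
// The CA bundle path below is an assumption; adjust it to wherever the
// certificates actually live on the target system.
static void init_with_explicit_ca_bundle()
{
    sentry_options_t *options = sentry_options_new();
    sentry_options_set_dsn(options, "https://examplePublicKey@oXXXX.ingest.sentry.io/XXXXXX");
    // Tell the Native SDK (and its curl-based transport) which CA bundle to
    // trust, instead of relying on the distro's default certificate store.
    sentry_options_set_ca_certs(options, "/etc/ssl/certs/ca-certificates.crt");
    sentry_init(options);
}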

@Zythiku

Zythiku commented May 11, 2023

Also getting the same error and had to disable Linux support. Got the error using WSL2 Ubuntu 22.04.2 LTS. Also getting a crash running on an Amazon Linux 2 EC2 instance.

Unreal version 5.1.1
Sentry SDK Version 0.5.0

[2023.05.11-18.29.35:455][  0]LogSentrySdk: started crashpad client handler
[2023.05.11-18.29.35:465][  0]LogSentrySdk: processing and pruning old runs
[2023.05.11-18.29.35:496][  0]LogSentrySdk: sending envelope
[2023.05.11-18.29.35:496][  0]LogSentrySdk: submitting task to background worker thread
[2023.05.11-18.29.35:512][  0]LogSentrySdk: executing task on worker thread
[2023.05.11-18.29.35:512][  0]LogSentrySdk: Sentry initialization completed with result 0 (0 on success).
[2023.05.11-18.29.35:773][  0]LogSentrySdk: flushing session and queue before crashpad handler
[2023.05.11-18.29.35:773][  0]LogSentrySdk: sending envelope
[2023.05.11-18.29.35:773][  0]LogSentrySdk: serializing envelope into buffer
[2023.05.11-18.29.35:773][  0]LogSentrySdk: serializing envelope into buffer
[2023.05.11-18.29.35:773][  0]LogSentrySdk: dumped 1 in-flight envelopes to disk
[2023.05.11-18.29.35:773][  0]LogSentrySdk: handing control over to crashpad
[2023.05.11-18.29.35:783][  0]LogCore: === Critical error: ===
Unhandled Exception: SIGSEGV: unaligned memory access (SIMD vectors?)

[2023.05.11-18.29.35:783][  0]LogCore: Fatal error!

0x00007fe6ba02a218 libcrypto.so.3!OPENSSL_sk_value(+0x18)
0x00007fe6ba04c77a libcrypto.so.3!X509_STORE_add_lookup(+0x29)
0x00007fe6ba052cbf libcrypto.so.3!X509_STORE_load_file_ex(+0x2e)
0x00007fe6ba3fd8c5 libcurl.so.4!UnknownFunction(0x728c4)
0x00007fe6ba403427 libcurl.so.4!UnknownFunction(0x78426)
0x00007fe6ba3ffd1b libcurl.so.4!UnknownFunction(0x74d1a)
0x00007fe6ba3b8086 libcurl.so.4!UnknownFunction(0x2d085)
0x00007fe6ba3d1f26 libcurl.so.4!UnknownFunction(0x46f25)
0x00007fe6ba3d441e libcurl.so.4!curl_multi_perform(+0xcd)
0x00007fe6ba3b0e53 libcurl.so.4!curl_easy_perform(+0x1a2)
0x00007fe6bcc7817d libsentry.so!UnknownFunction(0x2f17c)
0x00007fe6bcc6d312 libsentry.so!UnknownFunction(0x24311)
0x00007fe6ba6f0b43 libc.so.6!UnknownFunction(0x94b42)
0x00007fe6ba782a00 libc.so.6!UnknownFunction(0x1269ff)

[2023.05.11-18.29.35:794][  0]LogExit: Executing StaticShutdownAfterError

@tustanivsky
Collaborator

Alright, I've managed to reproduce this error on my end with the Linux Server build when launching it on WSL Ubuntu 22.04 LTS (on WSL Ubuntu 20.04 everything works as expected).

sentry-native, which the sentry-unreal plugin uses under the hood, was successfully initialized and even managed to capture the corresponding crash event.

I've tried switching to the older OpenSSL 1.1.1; however, that doesn't seem to help.

@supervacuus

Thanks, @tustanivsky, that helped a lot.

I was also able to repro this and could verify that there is indeed an OpenSSL version conflict. Specifically, Unreal Engine 5.1 is distributed with libcurl 7.83.1, which is statically linked and uses OpenSSL 1.1.1n as its TLS implementation. As described in my previous comment, Ubuntu switched to OpenSSL 3.0.2 with version 22.04, and it gets loaded together with libsentry.so (via the system libcurl). Now you have two conflicting OpenSSL versions running in the Linux server, which aren't meant to be used from the same process (it is also clearly visible in the debugger that both libraries are initialized).

I quickly built libcurl with OpenSSL 1.1.1n and started the Linux server with LD_LIBRARY_PATH pointing to my build of these libraries, and the server started without crashing. This step merely verifies that the OpenSSL conflict is the root cause and is not necessarily meant as a solution to the problem (although it is a viable approach as a quick fix).

Regarding a way toward a solution: the problem is 100% environmental; we cannot choose or modify the HTTP dependency of the Unreal server, and we cannot change the version choice of the Ubuntu maintainers. OpenSSL 1.1.1 is only supported until September 2023, so the problem will worsen as long as Epic stays with 1.1.1, because more Linux distros will switch.

The Native SDK has an API that allows clients to write their own transport, which in the case of sentry-unreal could be implemented in terms of the HTTP module in the Unreal runtime (which defers to Unreal's statically linked libcurl). This way, you can prevent any conflicting dependency from being loaded. I am unaware of the effort this would require or of blockers preventing it, but I can certainly support any implementation attempts.
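A minimal sketch of what such a custom transport could look like, assuming the Native SDK's C transport API and Unreal's HTTP module. The envelope endpoint URL, the auth key, and the function names are placeholders for this example, not details taken from this thread or from the eventual fix:

#include "HttpModule.h"
#include "Interfaces/IHttpRequest.h"
#include <sentry.h>

// Sketch only: push serialized envelopes through Unreal's HTTP module (which
// uses the engine's statically linked libcurl/OpenSSL 1.1.1), so the system
// libcurl and OpenSSL 3 never get loaded into the server process.
static void SendEnvelopeViaUnrealHttp(sentry_envelope_t *Envelope, void * /*State*/)
{
    size_t Size = 0;
    char *Buffer = sentry_envelope_serialize(Envelope, &Size);

    TSharedRef<IHttpRequest, ESPMode::ThreadSafe> Request = FHttpModule::Get().CreateRequest();
    // Placeholder envelope endpoint and key, normally derived from the project's DSN.
    Request->SetURL(TEXT("https://oXXXX.ingest.sentry.io/api/XXXXXX/envelope/"));
    Request->SetVerb(TEXT("POST"));
    Request->SetHeader(TEXT("Content-Type"), TEXT("application/x-sentry-envelope"));
    Request->SetHeader(TEXT("X-Sentry-Auth"), TEXT("Sentry sentry_version=7, sentry_key=XXXXXX"));

    TArray<uint8> Content(reinterpret_cast<const uint8 *>(Buffer), static_cast<int32>(Size));
    Request->SetContent(Content);
    Request->ProcessRequest();

    sentry_free(Buffer);
    sentry_envelope_free(Envelope); // the custom transport owns the envelope
}

// Hook the transport into the Native SDK before calling sentry_init().
static void ConfigureCustomTransport(sentry_options_t *Options)
{
    sentry_transport_t *Transport = sentry_transport_new(SendEnvelopeViaUnrealHttp);
    sentry_options_set_transport(Options, Transport);
}

Rate limiting, retries, threading, and flushing queued envelopes before handing control to crashpad are all omitted here; a real implementation would need to handle them.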

Of course, to state the obvious: given the way libsentry.so is built and distributed, we don't depend on any specific TLS implementation, so no code would have to be written to fix this problem. However, it would require an alternative to the libcurl package currently provided by Ubuntu LTS that the loader can find when the Linux server is started.

sentry-native, which the sentry-unreal plugin uses under the hood, was successfully initialized and even managed to capture the corresponding crash event.

Yeah, this is possible because the crash gets transmitted via the crashpad_handler, which is started as a separate process and thus not affected by the OpenSSL version used in the Unreal Engine.

I've tried switching to the older OpenSSL 1.1.1; however, that doesn't seem to help.

How did you test this, or rather how did you switch?

@tustanivsky
Collaborator

@supervacuus I appreciate you taking the time to research this issue and providing a detailed explanation of its origin and possible workarounds!

Writing our own custom transport for sentry-unreal in order to avoid this situation with conflicting dependencies might actually be a great solution. There are some third-party implementations out there (Example 1, Example 2) that we can use as a reference. I'll take a closer look at this and get back to you if more questions arise.
