feat(native): add backend trade-offs to Advanced Usage #11721

Open
wants to merge 11 commits into
base: master
69 changes: 69 additions & 0 deletions docs/platforms/native/advanced-usage/backend-tradeoffs/index.mdx
@@ -0,0 +1,69 @@
---
title: Backend Tradeoffs
description: "How to choose the right crash backend in the Native SDK."
sidebar_order: 1000
---
The Native SDK lets users decide at compile time between three crash backends:

* `crashpad`
* `breakpad`
* `inproc`

Currently, `crashpad` is the default on all desktop platforms because it

* has an external `handler` process that allows for external snapshots and sending crash reports immediately (instead of on the next successful start of your application)
* is the primary target for extension compared to upstream, including
Member

It's late so maybe that's why but I don't understand this sentence

specifically "target for extension compared to upstream", what's upstream in this case? and what extension are you referring to?

* client-side stack traces
Member

sorry if this is a dumb question, but what does this mean? Is that vs "server-side stack traces"?
Don't we create a minidump with crashpad and hence stack walk on the server?

* attachment handling
Member

Does this indicate we don't support attachments using other backends?
I believe crashpad only allows specifying attachments during init, is that right? That can be seen as a limitation. And technically we should be able to support attachments on the other backends (not by the backends themselves, but by the SDK. Hold on to the paths, on restart if the files are there, send them. No?)

* HTTP proxy support
* CMake build scripts
* GCC and MinGW support
* `FirstChanceHandler` on Windows and extension of its synchronization to support Sentry hooks
* cooperation with Epic's Easy Anti Cheat
* supports more error types on [Linux](/platforms/native/advanced-usage/signal-handling/#signals-of-interest) and Windows (`abort()` and other `fast-fail` crashes, handling of heap corruptions)
* is more actively maintained upstream (although most changes affect new platforms like Fuchsia)

### When shouldn't I use the `crashpad` backend?

Sentry decided on `crashpad` as the default on all desktop platforms because there are a lot of upsides. However, there are use cases where `crashpad` cannot be used or makes distribution or deployment much harder. We provide other backends for situations when

* you cannot package or deploy an additional executable (the `crashpad_handler`)
Member

is it useful to add 'or spawn'? Given environments where there's a sandbox and we can't create child processes? (Xbox and UWP I believe?)

Suggested change
* you cannot package or deploy an additional executable (the `crashpad_handler`)
* you cannot package, deploy or spawn an additional executable (the `crashpad_handler`)

I see this might be covered in the next two lines though

* you cannot allow a secondary process to connect via `ptrace` to your application (AWS Lambda, Flatpak and Snap sandboxes)
* IPC between your process and the `crashpad_handler` is inhibited by security settings or not available in your deployment target
* your deployment scenario cannot wait for the `crashpad_handler` to finish its work before a shutdown-after-crash (systemd, Docker)
* you want to distribute your application via the macOS App Store
* you want to define crash hooks on macOS, because there, error handling happens entirely in the `crashpad_handler`, whereas on Linux and Windows at least the initial handling happens in your process, after which `crashpad_handler` takes over and snapshots the process to send a crash report

In the above cases, if you cannot loosen the requirements of your environment, you have to choose an in-process backend (meaning either `breakpad` or `inproc`).

### How do I decide between `breakpad` and `inproc`?

Both backends are comparable in how they differ from `crashpad`. However, there are also considerable differences between the two:

* `inproc` only provides the backtrace of the crashing thread. `breakpad` records all threads in the minidump.
* similar to `crashpad`, `breakpad` uses the lowest-level error handling mechanism on each platform (macOS: mach exception ports, Windows: `UnhandledExceptionFilter`, Linux: signal handlers). It does cover a smaller range of errors, though, as mentioned above.
* `inproc`, on the other hand, uses signal handling on Linux and macOS (meaning you only get a `POSIX` compatibility layer over mach exception ports) and `UnhandledExceptionFilter` on Windows (solely errors registered by SEH)
Collaborator

(solely errors registered by SEH)

This could use further clarification on differences between inproc and breakpad

Collaborator Author

I wanted to say here that they use the same mechanism, and UnhandledExceptionFilter and SEH have equivalent meanings in this case. I can see how this easily reads as a distinction between breakpad and inproc. Should I remove it?

* as a result of choosing signal handling on macOS, `inproc` is currently broken on macOS since Apple eliminated unwinding from signal handlers
supervacuus marked this conversation as resolved.
* `inproc` is exceptionally lightweight and written entirely in C, meaning it does not rely on a weighty C++ runtime library, which is also more often affected by ABI incompatibilities
* `breakpad` generates minidumps (like `crashpad` does), whereas `inproc` follows the Sentry event structure and primarily defers to the OS-provided unwinder and symbolication capabilities. Sentry can potentially extract more information from the provided minidump than simply a stack trace and registers. However, it also means that the crash context inside the minidump will be opaque until processed in the backend, whereas a local `relay` instance could process an entire `inproc` event if required.
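
To make "signal handling" in the list above more concrete, here is a minimal sketch in C of how an in-process handler can be registered on a POSIX system. This is not code from `sentry-native`: the names `on_signal`, `install_handler`, and `last_signal_seen` are hypothetical, and a real backend would typically also set up an alternate signal stack and capture the crash context instead of only recording the signal number.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical state: the last signal the handler observed.
 * sig_atomic_t is the type that may safely be written from a handler. */
static volatile sig_atomic_t last_signal = 0;

/* Runs inside the crashing process, which is why the work done here
 * must be limited to async-signal-safe operations. */
static void on_signal(int signum, siginfo_t *info, void *ucontext)
{
    (void)info;     /* would carry the faulting address in a real handler */
    (void)ucontext; /* would carry the register state in a real handler */
    last_signal = signum;
}

/* Registers on_signal for signum. Returns 0 on success and -1 on
 * failure, mirroring sigaction(). */
int install_handler(int signum)
{
    assert(signum > 0);

    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO; /* request the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, NULL);
}

/* Returns the last signal number recorded by the handler, or 0. */
int last_signal_seen(void)
{
    return (int)last_signal;
}
```

A caller would invoke `install_handler(SIGSEGV)` (and similar for other signals of interest) early during start-up. Because the handler runs in the same process that just crashed, it has to work with whatever state that process is still in. An out-of-process handler such as the `crashpad_handler` does not have that constraint.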

### So when do I choose `inproc`?

`inproc` is currently the backend of choice for `Android` because it allows us to couple it with our own fork of a powerful platform unwinder `libunwindstack` (rather than relying on a user-space interface). This allows us to support very old Android versions.
Member

In

Suggested change
`inproc` is currently the backend of choice for `Android` because it allows us to couple it with our own fork of a powerful platform unwinder `libunwindstack` (rather than relying on a user-space interface). This allows us to support very old Android versions.
`inproc` is currently the backend of choice for `Android` because it allows us to couple it with our own fork of a powerful platform unwinder `libunwindstack` (rather than relying on a user-space interface). This allows us to support very old Android versions. In addition, stack walking on device on Android is preferred since we don't have all system symbols available for server-side symbolication. A [best-effort symbol collection exists](https://github.com/getsentry/symbol-collector), but that'll never be as reliable as stackwalking on device.


`inproc` is the right choice if you

* want minimal dependencies
* want the smallest footprint for the resulting artifact
* don't need to support the latest macOS versions
* find the minimal feature set compared to `breakpad` and `crashpad` sufficient for your scenario

### Summary

There are many trade-offs in the selection of backends if you dive into the details. The above merely scratches the surface. Sentry suggests a sequence of evaluations like

* `crashpad` (default)
* `breakpad`
* `inproc`

from most feature-complete to least, where a step down should only be triggered by environmental inhibitors. The points above should give you example decision points for your own error-reporting scenario.
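
The evaluation sequence can be sketched as a small decision function in C. This is only an illustration of the order of evaluation, not part of the SDK: the `env_constraints` struct, its fields, and `suggest_backend` are hypothetical names, and the fields are a simplified subset of the inhibitors listed on this page. The returned names match the values accepted by the `SENTRY_BACKEND` CMake option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum backend { BACKEND_CRASHPAD, BACKEND_BREAKPAD, BACKEND_INPROC };

/* Hypothetical description of a deployment environment. */
struct env_constraints {
    bool can_ship_and_spawn_handler; /* crashpad_handler can be packaged and started */
    bool allows_ptrace_and_ipc;      /* no sandbox blocks ptrace or IPC */
    bool can_wait_after_crash;       /* supervisor lets the handler finish */
    bool targets_mac_app_store;
    bool needs_macos_crash_hooks;
    bool cxx_runtime_acceptable;     /* breakpad depends on a C++ runtime */
};

/* Walks the sequence crashpad -> breakpad -> inproc and steps down only
 * when an environmental inhibitor rules out the current candidate. */
enum backend suggest_backend(const struct env_constraints *env)
{
    assert(env != NULL);

    bool crashpad_ok = env->can_ship_and_spawn_handler
        && env->allows_ptrace_and_ipc
        && env->can_wait_after_crash
        && !env->targets_mac_app_store
        && !env->needs_macos_crash_hooks;
    if (crashpad_ok) {
        return BACKEND_CRASHPAD;
    }
    if (env->cxx_runtime_acceptable) {
        return BACKEND_BREAKPAD;
    }
    return BACKEND_INPROC;
}

/* Maps the enum to the value passed to the SENTRY_BACKEND CMake option. */
const char *backend_name(enum backend b)
{
    switch (b) {
    case BACKEND_CRASHPAD: return "crashpad";
    case BACKEND_BREAKPAD: return "breakpad";
    default:               return "inproc";
    }
}
```

For example, an environment that forbids `ptrace` but accepts a C++ runtime would land on `breakpad`, while one that additionally rules out the C++ runtime would land on `inproc`. Real decisions involve more factors than these booleans, such as the macOS limitation of `inproc` described above.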
15 changes: 9 additions & 6 deletions docs/platforms/native/configuration/backends/index.mdx
@@ -9,19 +9,22 @@ and libraries. Similar to plugins, they extend the functionality of the Sentry
SDK.

The Native SDK can use different backends that are responsible for capturing
crashes. The backend is configured at build-time, using the `SENTRY_BACKEND`
crashes. The backend is configured at build-time using the `SENTRY_BACKEND`
CMake option, with support for the following options:

- [`crashpad`](crashpad/): This uses the out-of-process crashpad handler.
It is used as the default on Windows, macOS and Linux.
It is used as the default on Windows, macOS, and Linux.
- `breakpad`: This uses the in-process breakpad handler.
- `inproc`: A small in-process handler which is supported on all platforms,
and is used as default on Android.
- `inproc`: A small in-process handler supported on all platforms
and used as a default on Android. It no longer works on macOS as of version 13 ("Ventura").
supervacuus marked this conversation as resolved.
- `none`: This builds `sentry-native` without a backend, so it does not handle
crashes at all. It is primarily used for tests.
crashes. It is primarily used for tests.

`Breakpad` and `inproc` both run "in-process", so they run in the same process
as the application they capture crashes for. This means these backends can't
send a crash to Sentry immediately after it happens. Instead, the crash report is
written to disk and sent the next time the application is run. Since `Crashpad`
runs in a different process it doesn't have this limitation.
runs in a different process, it doesn't have this limitation.

The [Advanced Usage](/platforms/native/advanced-usage/backend-tradeoffs) section
explains the trade-offs in choosing the correct backend for your use case.