Fix possible crash or deadlock arising from calling notify() from multiple queues concurrently #401

kattrali · 2019-08-09T17:31:26Z

Goal

It's possible that, when notify() is called concurrently from multiple queues, a crash occurs while caching or resetting crash state. This changeset solves the underlying issues by not sharing state between calls to notify() and by isolating the calls that suspend threads (for backtrace collection).

Reproduction case

dispatch_queue_t queue1 = dispatch_queue_create("Queue 1", DISPATCH_QUEUE_CONCURRENT);
dispatch_queue_t queue2 = dispatch_queue_create("Queue 2", DISPATCH_QUEUE_CONCURRENT);

for (int i = 0; i < 4; i++) {
    NSString *message = [NSString stringWithFormat:@"Err %ld", (long)i];
    NSError *error = [FooError errorWithDomain:@"com.example"
                                          code:340
                                      userInfo:nil];
    dispatch_async(queue1, ^{
        [Bugsnag notifyError:error];
    });
    dispatch_async(queue2, ^{
        [Bugsnag notifyError:error];
    });
}

Fixes #399

Changeset

The changes can be divided into four parts:

Adding a (failing on master) test which calls notify() several times from different concurrent queues
Refactoring crash reporting internals to allow providing separate crash contexts for each invocation of notify()
Implementing separate crash contexts for notify() calls
Adding locking around suspending threads and writing crash state

The easiest way to view the changeset is one commit at a time, since each chunk can be evaluated independently.

Tests

Tested manually on a couple of different iOS/tvOS devices running iOS 11 and 12
Added automated tests replicating the failing scenario

tomlongridge

LGTM 👍

One suggestion: a unit test to cover the new function (if feasible).

tomlongridge · 2019-08-13T07:52:49Z

Tests/KSCrash/KSCrashIdentifierTests.m

+@interface KSCrashIdentifierTests : XCTestCase
+@end
+
+@implementation KSCrashIdentifierTests


Can we include a test for bsg_kscrash_generate_report_path?

Yeah, I should. It turned out to be a bit tricky to stub the file store, but I've added a task to improve coverage here.

nkavian · 2019-08-14T01:41:13Z

When will a new release be ready? Thanks.

kattrali · 2019-08-14T16:10:48Z

Hi @nkavian! Doing a release now, should be available soon.

nkavian · 2019-08-15T04:43:45Z

Thanks, confirming the latest release fixed my issue!

In the new Xcode 10 build system, Swift object register values have the top bit used as a flag. This change strips the flag without losing anything relevant to our goal of seeing error messages for assertion failures.

This technique does not capture messages which are shorter than 16 characters, as short strings are stored as raw char arrays on the stack rather than being allocated. (See WWDC 2018 session 401 for more info on the new string optimizations.) While it is possible to check for char arrays as well as pointers when searching for notable address values, sweeping up local variables is likely to capture unintended data from the surrounding code as well, some of which may be sensitive. It is also not guaranteed that the value would still be on the stack after the message is logged, so it is possible to get only unrelated string values as the message.

In the current Swift stdlib, the following messages passed to fatalError, preconditionFailure, and precondition (and their internal func counterparts) are shorter than 16 characters:

* empty string
* `unavailable`
* `not implemented`
* `abstract method`
* `unknown value`
* `invalid count` (where a dictionary contains < 0 items(?))
* `invalid index` (where a dictionary ceases to be a dictionary)
* `don't touch me` (from SpriteKit)
* `close() failed` (from the private Subprocess implementation)

The vast majority have more meaningful messages.

Reference:

* https://asciiwwdc.com/2018/sessions/401

Fixes bugsnag#318
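The flag stripping described in the commit message above can be illustrated with a small C sketch. This is not the code from that commit: the function name `strip_top_bit_flag` and the `TOP_BIT_FLAG` macro are hypothetical, and the sketch assumes the flag occupies only the most significant bit of a 64-bit register value, as the message states.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the flag is the single most significant bit
 * of a 64-bit register value. */
#define TOP_BIT_FLAG (UINT64_C(1) << 63)

/* Hypothetical helper: returns the register value with the flag bit cleared,
 * leaving the remaining 63 bits untouched. */
uint64_t strip_top_bit_flag(uint64_t register_value) {
    uint64_t stripped = register_value & ~TOP_BIT_FLAG;
    assert((stripped >> 63) == 0); /* postcondition: the flag bit is clear */
    return stripped;
}
```

A value without the flag set passes through unchanged, so the helper could be applied to every register value before checking whether it points at a string.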

kattrali added 9 commits August 9, 2019 18:12

tests: Add highly concurrent notify() case

401e5ce

refactor: Make the current crash context available to sentries

4697b22

Removed inlining/static to be compatible with C++ handler

refactor: Move crash report paths into context object

4351a46

Allows the system to have multiple crash report paths for different contexts

refactor: Provide context as an argument to the crash report writer

7d43d0c

Allows each report generator to use a different context if needed

feat: Add C interface to generating new crash IDs and file paths

2c810d5

While not async signal safe, this interface is suitable for generating IDs on the fly when calling reportUserException().

fix: Use a copy of the crash context for reports generated from notify

89798fc

This change allows each notify request to specify its own crash ID and file path, avoiding cases where two notify calls either reuse the same file path or stomp on each other when trying to set the "next" ID/path.
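A minimal sketch of the per-call context idea in C, not the code from this commit: each call copies a shared template and fills in its own ID and path, so concurrent calls never write to the same fields. Every name here (`report_context`, `make_notify_context`, the `/tmp/reports` directory, the `report-N.json` naming) is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a crash context; not the library's real type. */
typedef struct {
    char report_id[32];
    char report_path[128];
    const char *reports_dir;
} report_context;

/* Shared template: read by every call, written by none of them. */
static const report_context g_template = { "", "", "/tmp/reports" };

/* Atomic counter so that concurrent callers never receive the same ID. */
static atomic_uint_fast64_t g_next_id = 1;

/* Returns a private copy of the template with its own ID and file path.
 * snprintf is not async-signal-safe, so this is only suitable outside
 * signal handlers. */
report_context make_notify_context(void) {
    report_context ctx = g_template; /* copy by value, nothing is shared */
    unsigned long long id =
        (unsigned long long)atomic_fetch_add(&g_next_id, 1);

    int n = snprintf(ctx.report_id, sizeof ctx.report_id, "report-%llu", id);
    assert(n > 0 && (size_t)n < sizeof ctx.report_id);

    n = snprintf(ctx.report_path, sizeof ctx.report_path, "%s/%s.json",
                 ctx.reports_dir, ctx.report_id);
    assert(n > 0 && (size_t)n < sizeof ctx.report_path);
    return ctx;
}
```

Because the caller owns the returned struct, there is no shared "next" ID or path for a second caller to overwrite; the only shared write is the atomic increment.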

fix: Disallow multiple threads from suspending all other threads at once

9153ef2

Fixes #240
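One way to picture this fix is a single lock around the whole suspend, capture, resume sequence, so that two threads can never try to suspend each other at the same time. The C sketch below is an illustration under that assumption, not the code from this commit: `with_threads_suspended`, `run_concurrent_captures`, and the suspend/resume stubs are hypothetical, and the stubs only count callers instead of suspending real threads.

```c
#include <assert.h>
#include <pthread.h>

/* Serialises every "suspend all other threads" section. */
static pthread_mutex_t g_suspend_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping for the sketch: callers currently inside the section,
 * and the highest value that count ever reached. */
static int g_active = 0;
static int g_max_active = 0;

/* Hypothetical stubs; a real implementation would suspend and resume
 * every thread except the caller here. */
static void suspend_other_threads(void) {
    g_active++;
    if (g_active > g_max_active) g_max_active = g_active;
}
static void resume_other_threads(void) { g_active--; }

/* Runs capture(arg) while holding the suspension lock. */
void with_threads_suspended(void (*capture)(void *), void *arg) {
    pthread_mutex_lock(&g_suspend_lock);
    suspend_other_threads();
    capture(arg);
    resume_other_threads();
    pthread_mutex_unlock(&g_suspend_lock);
}

/* Example capture step: counts how many captures ran. */
static void count_capture(void *arg) { (*(int *)arg)++; }

typedef struct { int iterations; int *captures; } worker_args;

static void *worker(void *arg) {
    worker_args *args = arg;
    for (int i = 0; i < args->iterations; i++)
        with_threads_suspended(count_capture, args->captures);
    return NULL;
}

/* Starts nthreads workers and returns the maximum number of callers that
 * were ever inside the suspended section at once. */
int run_concurrent_captures(int nthreads, int iterations, int *captures) {
    pthread_t threads[16];
    worker_args args = { iterations, captures };
    assert(nthreads > 0 && nthreads <= 16);
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &args);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return g_max_active;
}
```

The mutex is not recursive, so in this sketch the capture callback must not call `with_threads_suspended` itself.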

fix: Only save crash state if a crash actually occurred

193c445

This prevents multiple threads calling notify() from stomping on this file

docs(changes): Add entry for concurrency improvements

6b2a46e

kattrali mentioned this pull request Aug 9, 2019

SIGABRT in notifyError #399

Closed

tomlongridge approved these changes Aug 13, 2019

kattrali merged commit c3d5975 into master Aug 14, 2019

kattrali deleted the kattrali/use-separate-reporting-context-in-notify branch August 14, 2019 16:07

kattrali mentioned this pull request Aug 14, 2019

NSInvalidArgumentException in notifyError #402

Closed

kattrali mentioned this pull request Jan 2, 2020

fix(utils): Remove global buffer used when writing files to disk #442

Merged
