Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perform CIL=>LLVM translation using a symbolic execution engine. #24

Merged
merged 3 commits into from
Jul 27, 2018

Conversation

mizho
Copy link
Collaborator

@mizho mizho commented Jul 26, 2018

No description provided.

@lewurm
Copy link
Collaborator

lewurm commented Jul 26, 2018

hey @mizho, CILSymbolicExecutor.cs and OpResultTypeLookup.cs look pretty good! LLVM/BitCodeEmitter.cs is a bit of an overlap with class Builder from BigStep.cs, right?

CILSymbolicExecutor is still unused. How do you want to go forward from this?

  • do you wanna write some basic tests?
  • how do you wanna integrate it into the BigStep compiler?
  • do you want to get it merged first?

LLVM.InitializeX86TargetInfo();
LLVM.InitializeX86AsmParser();
LLVM.InitializeX86AsmPrinter();
LLVM.InitializeMCJITCompilerOptions(s_options);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only gives you x86 (32bit), we really want amd64 (64bit) code generated. so far, the generated code happens to be simple enough that it doesn't matter.

that's how you do it:
ebe23b0


private LLVMValueRef[] argAddrs;
private LLVMValueRef[] localAddrs;
private Dictionary<string, LLVMValueRef> temps;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Longer-term I kind of wish these stringly-keyed dictionaries were strongly typed.

private LLVMValueRef[] argAddrs;
private LLVMValueRef[] localAddrs;
private Dictionary<string, LLVMValueRef> temps;
private Dictionary<string, LLVMBasicBlockRef> bbs;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here the key shouldn't be a string, long term.

Copy link
Owner

@lambdageek lambdageek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like the split between the Symbolic Executor and the LLVM backend. That said, the LLVM backend could probably be a little cleaner if it separated the code that just deals with emitting instructions from the LLVM state management and from the translation state management (ie like the BigStep classic Builder + Env classes). But this is great already.

@lambdageek lambdageek merged commit c33903e into mjit Jul 27, 2018
lambdageek pushed a commit that referenced this pull request Jun 25, 2019
We recently switched the AOT compiler for iOS armv7 from 32bit host to 64bit host. At the same time we _also_ switched from LLVM 3.6 to the newer LLVM 6.0 fork we support.

In a Xamarin.iOS Release build for iOS armv7 exception handling would crash. Specifically the problem was that the exception object wasn't properly injected into the handler block. The way it works is, that the exception handler machinery passes the pointer to the exception object via `r0`.

With the newer LLVM the generated code of the exception handling block
looks like this:
```
LBB1_15:                                @ %EH_CLAUSE0_BB3
Ltmp8:
	ldr	r0, [sp, #24]
	str	r0, [sp, #4]
	ldr	r0, [sp, #28]
	ldr	r0, [sp, #12]
	ldr	r0, [r0]
	cmp	r0, #0
	beq	LBB1_17
```
thus overwriting `r0`, so that the pointer to the exception object is lost.  With the older LLVM 3.6 the generated code looks like this:
```
LBB1_8:                                 @ %EH_CLAUSE0_BB3
Ltmp8:
	str	r0, [sp, #4]
	ldr	r0, [r6]
	cmp	r0, #0
	beq	LBB1_10
```
correctly storing the exception object into a stack slot for later usage.

After some time I figured out that there are _different_ exception handling models. Depending on the model, the exception pointer is passed in `r0` or not:
https://github.com/mono/llvm/blob/2bd2f1db1803f7b36687e5abf912c69baa848305/lib/Target/ARM/ARMISelLowering.cpp#L14416-L14428

`SjLj` stands for "SetJump / LongJump" and  means that exception handling is implemented with this. On iOS this is the default model that is used. We want `dwarf` instead, so we tell this `llc` now and the generated code handles it correctly.

So why was it working before? In our older LLVM 3.6 fork we hardcoded it:
https://github.com/mono/llvm/blob/f80899cb3eb75f7f5640b4519e83bd96991bffb8/lib/Target/ARM/ARMISelLowering.cpp#L756-L762

The `-exception-model=` option is also not available in LLVM 3.6.

Contributes to mono#15058 and mono#9621
lambdageek pushed a commit that referenced this pull request Aug 19, 2019
We recently switched the AOT compiler for iOS armv7 from 32bit host to 64bit host. At the same time we _also_ switched from LLVM 3.6 to the newer LLVM 6.0 fork we support.

In a Xamarin.iOS Release build for iOS armv7 exception handling would crash. Specifically the problem was that the exception object wasn't properly injected into the handler block. The way it works is, that the exception handler machinery passes the pointer to the exception object via `r0`.

With the newer LLVM the generated code of the exception handling block
looks like this:
```
LBB1_15:                                @ %EH_CLAUSE0_BB3
Ltmp8:
	ldr	r0, [sp, #24]
	str	r0, [sp, #4]
	ldr	r0, [sp, #28]
	ldr	r0, [sp, #12]
	ldr	r0, [r0]
	cmp	r0, #0
	beq	LBB1_17
```
thus overwriting `r0`, so that the pointer to the exception object is lost.  With the older LLVM 3.6 the generated code looks like this:
```
LBB1_8:                                 @ %EH_CLAUSE0_BB3
Ltmp8:
	str	r0, [sp, #4]
	ldr	r0, [r6]
	cmp	r0, #0
	beq	LBB1_10
```
correctly storing the exception object into a stack slot for later usage.

After some time I figured out that there are _different_ exception handling models. Depending on the model, the exception pointer is passed in `r0` or not:
https://github.com/mono/llvm/blob/2bd2f1db1803f7b36687e5abf912c69baa848305/lib/Target/ARM/ARMISelLowering.cpp#L14416-L14428

`SjLj` stands for "SetJump / LongJump" and  means that exception handling is implemented with this. On iOS this is the default model that is used. We want `dwarf` instead, so we tell this `llc` now and the generated code handles it correctly.

So why was it working before? In our older LLVM 3.6 fork we hardcoded it:
https://github.com/mono/llvm/blob/f80899cb3eb75f7f5640b4519e83bd96991bffb8/lib/Target/ARM/ARMISelLowering.cpp#L756-L762

The `-exception-model=` option is also not available in LLVM 3.6.

Contributes to mono#15058 and mono#9621
lambdageek added a commit that referenced this pull request Sep 26, 2019
)

* [threads] clear small_id_key TLS when unregistering a thread

Fixes
```
* thread #12, name = 'tid_a507', stop reason = EXC_BREAKPOINT (code=1, subcode=0x1be66144)
  * frame #0: 0x1be66144 libsystem_c.dylib`__abort + 184
    frame #1: 0x1be6608c libsystem_c.dylib`abort + 152
    frame #2: 0x003e1fa0 monotouchtest`log_callback(log_domain=0x00000000, log_level="error", message="* Assertion at ../../../../../mono/utils/hazard-pointer.c:158, condition `mono_bitset_test_fast (small_id_table, id)' not met\n", fatal=4, user_data=0x00000000) at runtime.m:1251:3
    frame #3: 0x003abf44 monotouchtest`monoeg_g_logv_nofree(log_domain=0x00000000, log_level=G_LOG_LEVEL_ERROR, format=<unavailable>, args=<unavailable>) at goutput.c:149:2 [opt]
    frame #4: 0x003abfb4 monotouchtest`monoeg_assertion_message(format=<unavailable>) at goutput.c:184:22 [opt]
    frame #5: 0x003904dc monotouchtest`mono_thread_small_id_free(id=<unavailable>) at hazard-pointer.c:0:2 [opt]
    frame #6: 0x003a0a74 monotouchtest`unregister_thread(arg=0x15c88400) at mono-threads.c:588:2 [opt]
    frame #7: 0x00336110 monotouchtest`mono_thread_detach_if_exiting at threads.c:1571:4 [opt]
    frame #8: 0x003e7a14 monotouchtest`::xamarin_release_trampoline(self=0x166452f0, sel="release") at trampolines.m:644:3
    frame #9: 0x001cdc40 monotouchtest`::-[__Xamarin_NSTimerActionDispatcher release](self=0x166452f0, _cmd="release") at registrar.m:83445:3
    frame #10: 0x1ce2ae68 Foundation`_timerRelease + 80
    frame #11: 0x1c31b56c CoreFoundation`CFRunLoopTimerInvalidate + 612
    frame #12: 0x1c31a554 CoreFoundation`__CFRunLoopTimerDeallocate + 32
    frame #13: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #14: 0x1c2be6e8 CoreFoundation`__CFArrayReleaseValues + 500
    frame #15: 0x1c2be4c4 CoreFoundation`CFArrayRemoveAllValues + 104
    frame #16: 0x1c31ff64 CoreFoundation`__CFSetApplyFunction_block_invoke + 24
    frame #17: 0x1c3b2e18 CoreFoundation`CFBasicHashApply + 116
    frame #18: 0x1c31ff10 CoreFoundation`CFSetApplyFunction + 160
    frame #19: 0x1c3152cc CoreFoundation`__CFRunLoopDeallocate + 204
    frame #20: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #21: 0x1c304a80 CoreFoundation`__CFTSDFinalize + 144
    frame #22: 0x1bfa324c libsystem_pthread.dylib`_pthread_tsd_cleanup + 644
    frame #23: 0x1bf9cc08 libsystem_pthread.dylib`_pthread_exit + 80
    frame #24: 0x1bf9af24 libsystem_pthread.dylib`pthread_exit + 36
    frame #25: 0x0039df84 monotouchtest`mono_threads_platform_exit(exit_code=<unavailable>) at mono-threads-posix.c:145:2 [opt]
    frame #26: 0x0033bb84 monotouchtest`start_wrapper(data=0x1609e1c0) at threads.c:1296:2 [opt]
    frame #27: 0x1bf9b914 libsystem_pthread.dylib`_pthread_body + 128
    frame #28: 0x1bf9b874 libsystem_pthread.dylib`_pthread_start + 44
    frame #29: 0x1bfa3b94 libsystem_pthread.dylib`thread_start + 4
```

* Update mono/utils/mono-threads.c

Co-Authored-By: Aleksey Kliger (λgeek) <[email protected]>
lambdageek pushed a commit that referenced this pull request Oct 11, 2019
mono#17015)

[2019-08] [threads] clear small_id_key TLS when unregistering a thread

Fixes
```
* thread #12, name = 'tid_a507', stop reason = EXC_BREAKPOINT (code=1, subcode=0x1be66144)
  * frame #0: 0x1be66144 libsystem_c.dylib`__abort + 184
    frame #1: 0x1be6608c libsystem_c.dylib`abort + 152
    frame #2: 0x003e1fa0 monotouchtest`log_callback(log_domain=0x00000000, log_level="error", message="* Assertion at ../../../../../mono/utils/hazard-pointer.c:158, condition `mono_bitset_test_fast (small_id_table, id)' not met\n", fatal=4, user_data=0x00000000) at runtime.m:1251:3
    frame #3: 0x003abf44 monotouchtest`monoeg_g_logv_nofree(log_domain=0x00000000, log_level=G_LOG_LEVEL_ERROR, format=<unavailable>, args=<unavailable>) at goutput.c:149:2 [opt]
    frame #4: 0x003abfb4 monotouchtest`monoeg_assertion_message(format=<unavailable>) at goutput.c:184:22 [opt]
    frame #5: 0x003904dc monotouchtest`mono_thread_small_id_free(id=<unavailable>) at hazard-pointer.c:0:2 [opt]
    frame #6: 0x003a0a74 monotouchtest`unregister_thread(arg=0x15c88400) at mono-threads.c:588:2 [opt]
    frame #7: 0x00336110 monotouchtest`mono_thread_detach_if_exiting at threads.c:1571:4 [opt]
    frame #8: 0x003e7a14 monotouchtest`::xamarin_release_trampoline(self=0x166452f0, sel="release") at trampolines.m:644:3
    frame #9: 0x001cdc40 monotouchtest`::-[__Xamarin_NSTimerActionDispatcher release](self=0x166452f0, _cmd="release") at registrar.m:83445:3
    frame #10: 0x1ce2ae68 Foundation`_timerRelease + 80
    frame #11: 0x1c31b56c CoreFoundation`CFRunLoopTimerInvalidate + 612
    frame #12: 0x1c31a554 CoreFoundation`__CFRunLoopTimerDeallocate + 32
    frame #13: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #14: 0x1c2be6e8 CoreFoundation`__CFArrayReleaseValues + 500
    frame #15: 0x1c2be4c4 CoreFoundation`CFArrayRemoveAllValues + 104
    frame #16: 0x1c31ff64 CoreFoundation`__CFSetApplyFunction_block_invoke + 24
    frame #17: 0x1c3b2e18 CoreFoundation`CFBasicHashApply + 116
    frame #18: 0x1c31ff10 CoreFoundation`CFSetApplyFunction + 160
    frame #19: 0x1c3152cc CoreFoundation`__CFRunLoopDeallocate + 204
    frame #20: 0x1c31dde4 CoreFoundation`_CFRelease + 220
    frame #21: 0x1c304a80 CoreFoundation`__CFTSDFinalize + 144
    frame #22: 0x1bfa324c libsystem_pthread.dylib`_pthread_tsd_cleanup + 644
    frame #23: 0x1bf9cc08 libsystem_pthread.dylib`_pthread_exit + 80
    frame #24: 0x1bf9af24 libsystem_pthread.dylib`pthread_exit + 36
    frame #25: 0x0039df84 monotouchtest`mono_threads_platform_exit(exit_code=<unavailable>) at mono-threads-posix.c:145:2 [opt]
    frame #26: 0x0033bb84 monotouchtest`start_wrapper(data=0x1609e1c0) at threads.c:1296:2 [opt]
    frame #27: 0x1bf9b914 libsystem_pthread.dylib`_pthread_body + 128
    frame #28: 0x1bf9b874 libsystem_pthread.dylib`_pthread_start + 44
    frame #29: 0x1bfa3b94 libsystem_pthread.dylib`thread_start + 4
```


Debugged by @lambdageek. Contributes to mono#10641

Edit: C/p into PR description so it's in the commit message:

We think what happens is:

1. `start_wrapper` runs `unregister_thread` when the thread is exiting.  At that point the `small_id` for the thread is cleared from the hazard pointer bitset by `mono_thread_small_id_free (small_id)`

2. Some other TLS destructor runs and attaches the thread again (which runs `mono_thread_info_register_small_id` which first calls `mono_thread_info_get_small_id` which tries to get the small id from the `small_id_key` TLS key and so the new `MonoThreadInfo` has the same `small_id` as the previously destroyed `MonoThreadInfo` - but the hazard pointer bitset is **not** updated).

3. This other TLS destructor runs and calls `mono_thread_detach_if_exiting` which calls `unregister_thread` again.

4. `unregister_thread` calls `	mono_thread_small_id_free (small_id)` a second time which asserts because we already cleared that id from the hazard pointer bitset.



Backport of mono#16973.

/cc @lewurm
lambdageek pushed a commit that referenced this pull request Dec 11, 2019
When the debugger tries to suspend all the VM threads, it can happen with cooperative-suspend that it tries to suspend a thread that has previously been attached, but then did a "light" detach (that only unsets the domain). With the domain set to `NULL`, looking up a JitInfo for a given `ip` will result into a crash, but it isn't even necessarily needed for the purpose of suspending a thread.

More details: Consider the following:
```
(lldb) c
error: Process is running.  Use 'process interrupt' to pause execution.
Process 12832 stopped
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
    frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
   2713         MonoDomain *domain = (MonoDomain *) mono_thread_info_get_suspend_state (info)->unwind_data [MONO_UNWIND_DATA_DOMAIN];
   2714         if (!domain) {
   2715                 /* not attached */
-> 2716                 ji = NULL;
   2717         } else {
   2718                 ji = mono_jit_info_table_find_internal ( domain, MONO_CONTEXT_GET_IP (&mono_thread_info_get_suspend_state (info)->ctx), TRUE, TRUE);
   2719         }
Target 0: (WatchWCSessionAppWatchOSExtension) stopped.
(lldb) bt
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  * frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
    frame #1: 0x004e177b WatchWCSessionAppWatchOSExtension`mono_thread_info_safe_suspend_and_run(id=0xb0767000, interrupt_kernel=0, callback=(WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical at debugger-agent.c:2708), user_data=0xb04dc718) at mono-threads.c:1358:19
    frame #2: 0x00222799 WatchWCSessionAppWatchOSExtension`notify_thread(key=0x03994508, value=0x81378c00, user_data=0x00000000) at debugger-agent.c:2747:2
    frame #3: 0x00355bd8 WatchWCSessionAppWatchOSExtension`mono_g_hash_table_foreach(hash=0x8007b4e0, func=(WatchWCSessionAppWatchOSExtension`notify_thread at debugger-agent.c:2733), user_data=0x00000000) at mono-hash.c:310:4
    frame #4: 0x0021e955 WatchWCSessionAppWatchOSExtension`suspend_vm at debugger-agent.c:2844:3
    frame #5: 0x00225ff0 WatchWCSessionAppWatchOSExtension`process_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0, il_offset=0, ctx=0x00000000, events=0x805b28e0, suspend_policy=2) at debugger-agent.c:4012:3
    frame #6: 0x00227c7c WatchWCSessionAppWatchOSExtension`process_profiler_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0) at debugger-agent.c:4072:2
    frame #7: 0x0021b174 WatchWCSessionAppWatchOSExtension`thread_startup(prof=0x00000000, tid=2957889536) at debugger-agent.c:4149:2
    frame #8: 0x0037912d WatchWCSessionAppWatchOSExtension`mono_profiler_raise_thread_started(tid=2957889536) at profiler-events.h:103:1
    frame #9: 0x003d53da WatchWCSessionAppWatchOSExtension`fire_attach_profiler_events(tid=0xb04dd000) at threads.c:1120:2
    frame #10: 0x003d4d83 WatchWCSessionAppWatchOSExtension`mono_thread_attach(domain=0x801750a0) at threads.c:1547:2
    frame #11: 0x003df1a1 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop_internal(domain=0x801750a0, cookie=0xb04dcc0c, stackdata=0xb04dcba8) at threads.c:6008:3
    frame #12: 0x003df287 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop(domain=0x00000000, dummy=0xb04dcc0c) at threads.c:6045:9
    frame #13: 0x005034b8 WatchWCSessionAppWatchOSExtension`::xamarin_switch_gchandle(self=0x80762c20, to_weak=false) at runtime.m:1805:2
    frame #14: 0x005065c1 WatchWCSessionAppWatchOSExtension`::xamarin_retain_trampoline(self=0x80762c20, sel="retain") at trampolines.m:693:2
    frame #15: 0x657ea520 libobjc.A.dylib`objc_retain + 64
    frame #16: 0x4b4d9caa WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke + 279
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb26e Foundation`____addOperations_block_invoke_4 + 20
    frame #24: 0x65de007b libdispatch.dylib`_dispatch_call_block_and_release + 15
    frame #25: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #26: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #27: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #28: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #29: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #30: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #31: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```

Going further, `info` is about this thread:
```
(lldb) p/x *(int *)(((char *) info->node.key) + 0xa0)
(int) $2 = 0x01243f93
(lldb) thread list
Process 12832 stopped
  thread #1: tid = 0x1243ee1, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_303', queue = 'com.apple.main-thread'
  thread #2: tid = 0x1243ee6, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10
  thread #3: tid = 0x1243ee7, 0x65f81aea libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'SGen worker'
  thread #4: tid = 0x1243ee9, 0x65f7e3d2 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'Finalizer'
  thread #5: tid = 0x1243eea, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10, name = 'Debugger agent'
  thread #6: tid = 0x1243f1d, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.uikit.eventfetch-thread'
  thread #8: tid = 0x1243f93, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
* thread #9: tid = 0x12443a9, 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  thread #10: tid = 0x1244581, 0x65f7fd32 libsystem_kernel.dylib`__workq_kernreturn + 10
(lldb) thread select 8
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
    frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
libsystem_kernel.dylib`mach_msg_trap:
->  0x65f7e396 <+10>: retl
    0x65f7e397 <+11>: nop

libsystem_kernel.dylib`mach_msg_overwrite_trap:
    0x65f7e398 <+0>:  movl   $0xffffffe0, %eax         ; imm = 0xFFFFFFE0
    0x65f7e39d <+5>:  calll  0x65f85f44                ; _sysenter_trap
(lldb) bt
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
  * frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x65f7e8ff libsystem_kernel.dylib`mach_msg + 47
    frame #2: 0x66079679 libxpc.dylib`_xpc_send_serializer + 104
    frame #3: 0x660794da libxpc.dylib`_xpc_pipe_simpleroutine + 80
    frame #4: 0x66079852 libxpc.dylib`xpc_pipe_simpleroutine + 43
    frame #5: 0x66043a8f libsystem_trace.dylib`___os_activity_stream_reflect_block_invoke_2 + 30
    frame #6: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #7: 0x65de3d71 libdispatch.dylib`_dispatch_block_invoke_direct + 257
    frame #8: 0x65de3c62 libdispatch.dylib`dispatch_block_perform + 112
    frame #9: 0x6604349a libsystem_trace.dylib`_os_activity_stream_reflect + 725
    frame #10: 0x6604ef19 libsystem_trace.dylib`_os_log_impl_stream + 468
    frame #11: 0x6604e44d libsystem_trace.dylib`_os_log_impl_flatten_and_send + 6410
    frame #12: 0x6604cb3b libsystem_trace.dylib`_os_log + 137
    frame #13: 0x6604f4aa libsystem_trace.dylib`_os_log_impl + 31
    frame #14: 0x4b4eb4e9 WatchConnectivity`WCSerializePayloadDictionary + 393
    frame #15: 0x4b4d7c4d WatchConnectivity`-[WCSession onqueue_sendResponseDictionary:identifier:] + 195
    frame #16: 0x4b4da435 WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke.411 + 35
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb067 Foundation`____addOperations_block_invoke_2 + 20
    frame #24: 0x65dedf49 libdispatch.dylib`_dispatch_block_async_invoke2 + 77
    frame #25: 0x65de4461 libdispatch.dylib`_dispatch_block_async_invoke_and_release + 17
    frame #26: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #27: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #28: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #29: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #30: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #31: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #32: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```
which is a thread in a "light" detach state (aka. coop detach), where we only unset the domain:
https://github.com/mono/mono/blob/4cefdcb7ce2d939ee78fb45d1b4913eb3bc064fd/mono/metadata/threads.c#L6084-L6111

Fixes mono#17926
lambdageek pushed a commit that referenced this pull request Mar 10, 2021
When the debugger tries to suspend all the VM threads, it can happen with cooperative-suspend that it tries to suspend a thread that has previously been attached, but then did a "light" detach (that only unsets the domain). With the domain set to `NULL`, looking up a JitInfo for a given `ip` will result into a crash, but it isn't even necessarily needed for the purpose of suspending a thread.

More details: Consider the following:
```
(lldb) c
error: Process is running.  Use 'process interrupt' to pause execution.
Process 12832 stopped
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
    frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
   2713         MonoDomain *domain = (MonoDomain *) mono_thread_info_get_suspend_state (info)->unwind_data [MONO_UNWIND_DATA_DOMAIN];
   2714         if (!domain) {
   2715                 /* not attached */
-> 2716                 ji = NULL;
   2717         } else {
   2718                 ji = mono_jit_info_table_find_internal ( domain, MONO_CONTEXT_GET_IP (&mono_thread_info_get_suspend_state (info)->ctx), TRUE, TRUE);
   2719         }
Target 0: (WatchWCSessionAppWatchOSExtension) stopped.
(lldb) bt
* thread #9, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  * frame #0: 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6
    frame #1: 0x004e177b WatchWCSessionAppWatchOSExtension`mono_thread_info_safe_suspend_and_run(id=0xb0767000, interrupt_kernel=0, callback=(WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical at debugger-agent.c:2708), user_data=0xb04dc718) at mono-threads.c:1358:19
    frame #2: 0x00222799 WatchWCSessionAppWatchOSExtension`notify_thread(key=0x03994508, value=0x81378c00, user_data=0x00000000) at debugger-agent.c:2747:2
    frame #3: 0x00355bd8 WatchWCSessionAppWatchOSExtension`mono_g_hash_table_foreach(hash=0x8007b4e0, func=(WatchWCSessionAppWatchOSExtension`notify_thread at debugger-agent.c:2733), user_data=0x00000000) at mono-hash.c:310:4
    frame #4: 0x0021e955 WatchWCSessionAppWatchOSExtension`suspend_vm at debugger-agent.c:2844:3
    frame #5: 0x00225ff0 WatchWCSessionAppWatchOSExtension`process_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0, il_offset=0, ctx=0x00000000, events=0x805b28e0, suspend_policy=2) at debugger-agent.c:4012:3
    frame #6: 0x00227c7c WatchWCSessionAppWatchOSExtension`process_profiler_event(event=EVENT_KIND_THREAD_START, arg=0x039945d0) at debugger-agent.c:4072:2
    frame #7: 0x0021b174 WatchWCSessionAppWatchOSExtension`thread_startup(prof=0x00000000, tid=2957889536) at debugger-agent.c:4149:2
    frame #8: 0x0037912d WatchWCSessionAppWatchOSExtension`mono_profiler_raise_thread_started(tid=2957889536) at profiler-events.h:103:1
    frame #9: 0x003d53da WatchWCSessionAppWatchOSExtension`fire_attach_profiler_events(tid=0xb04dd000) at threads.c:1120:2
    frame #10: 0x003d4d83 WatchWCSessionAppWatchOSExtension`mono_thread_attach(domain=0x801750a0) at threads.c:1547:2
    frame #11: 0x003df1a1 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop_internal(domain=0x801750a0, cookie=0xb04dcc0c, stackdata=0xb04dcba8) at threads.c:6008:3
    frame #12: 0x003df287 WatchWCSessionAppWatchOSExtension`mono_threads_attach_coop(domain=0x00000000, dummy=0xb04dcc0c) at threads.c:6045:9
    frame #13: 0x005034b8 WatchWCSessionAppWatchOSExtension`::xamarin_switch_gchandle(self=0x80762c20, to_weak=false) at runtime.m:1805:2
    frame #14: 0x005065c1 WatchWCSessionAppWatchOSExtension`::xamarin_retain_trampoline(self=0x80762c20, sel="retain") at trampolines.m:693:2
    frame #15: 0x657ea520 libobjc.A.dylib`objc_retain + 64
    frame #16: 0x4b4d9caa WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke + 279
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb26e Foundation`____addOperations_block_invoke_4 + 20
    frame #24: 0x65de007b libdispatch.dylib`_dispatch_call_block_and_release + 15
    frame #25: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #26: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #27: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #28: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #29: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #30: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #31: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```

Going further, `info` is about this thread:
```
(lldb) p/x *(int *)(((char *) info->node.key) + 0xa0)
(int) $2 = 0x01243f93
(lldb) thread list
Process 12832 stopped
  thread #1: tid = 0x1243ee1, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_303', queue = 'com.apple.main-thread'
  thread #2: tid = 0x1243ee6, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10
  thread #3: tid = 0x1243ee7, 0x65f81aea libsystem_kernel.dylib`__psynch_cvwait + 10, name = 'SGen worker'
  thread #4: tid = 0x1243ee9, 0x65f7e3d2 libsystem_kernel.dylib`semaphore_wait_trap + 10, name = 'Finalizer'
  thread #5: tid = 0x1243eea, 0x65f816e2 libsystem_kernel.dylib`__recvfrom + 10, name = 'Debugger agent'
  thread #6: tid = 0x1243f1d, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.uikit.eventfetch-thread'
  thread #8: tid = 0x1243f93, 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
* thread #9: tid = 0x12443a9, 0x00222870 WatchWCSessionAppWatchOSExtension`debugger_interrupt_critical(info=0x81365400, user_data=0xb04dc718) at debugger-agent.c:2716:6, name = 'tid_a31f', queue = 'NSOperationQueue 0x8069b670 (QOS: UTILITY)', stop reason = breakpoint 1.1
  thread #10: tid = 0x1244581, 0x65f7fd32 libsystem_kernel.dylib`__workq_kernreturn + 10
(lldb) thread select 8
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
    frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
libsystem_kernel.dylib`mach_msg_trap:
->  0x65f7e396 <+10>: retl
    0x65f7e397 <+11>: nop

libsystem_kernel.dylib`mach_msg_overwrite_trap:
    0x65f7e398 <+0>:  movl   $0xffffffe0, %eax         ; imm = 0xFFFFFFE0
    0x65f7e39d <+5>:  calll  0x65f85f44                ; _sysenter_trap
(lldb) bt
* thread #8, name = 'tid_6d0f', queue = 'NSOperationQueue 0x8069b300 (QOS: UTILITY)'
  * frame #0: 0x65f7e396 libsystem_kernel.dylib`mach_msg_trap + 10
    frame #1: 0x65f7e8ff libsystem_kernel.dylib`mach_msg + 47
    frame #2: 0x66079679 libxpc.dylib`_xpc_send_serializer + 104
    frame #3: 0x660794da libxpc.dylib`_xpc_pipe_simpleroutine + 80
    frame #4: 0x66079852 libxpc.dylib`xpc_pipe_simpleroutine + 43
    frame #5: 0x66043a8f libsystem_trace.dylib`___os_activity_stream_reflect_block_invoke_2 + 30
    frame #6: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #7: 0x65de3d71 libdispatch.dylib`_dispatch_block_invoke_direct + 257
    frame #8: 0x65de3c62 libdispatch.dylib`dispatch_block_perform + 112
    frame #9: 0x6604349a libsystem_trace.dylib`_os_activity_stream_reflect + 725
    frame #10: 0x6604ef19 libsystem_trace.dylib`_os_log_impl_stream + 468
    frame #11: 0x6604e44d libsystem_trace.dylib`_os_log_impl_flatten_and_send + 6410
    frame #12: 0x6604cb3b libsystem_trace.dylib`_os_log + 137
    frame #13: 0x6604f4aa libsystem_trace.dylib`_os_log_impl + 31
    frame #14: 0x4b4eb4e9 WatchConnectivity`WCSerializePayloadDictionary + 393
    frame #15: 0x4b4d7c4d WatchConnectivity`-[WCSession onqueue_sendResponseDictionary:identifier:] + 195
    frame #16: 0x4b4da435 WatchConnectivity`__66-[WCSession onqueue_handleDictionaryMessageRequest:withPairingID:]_block_invoke.411 + 35
    frame #17: 0x453c7df7 Foundation`__NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__ + 12
    frame #18: 0x453c7cf4 Foundation`-[NSBlockOperation main] + 88
    frame #19: 0x453cacee Foundation`__NSOPERATION_IS_INVOKING_MAIN__ + 27
    frame #20: 0x453c6ebd Foundation`-[NSOperation start] + 835
    frame #21: 0x453cb606 Foundation`__NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__ + 27
    frame #22: 0x453cb12e Foundation`__NSOQSchedule_f + 194
    frame #23: 0x453cb067 Foundation`____addOperations_block_invoke_2 + 20
    frame #24: 0x65dedf49 libdispatch.dylib`_dispatch_block_async_invoke2 + 77
    frame #25: 0x65de4461 libdispatch.dylib`_dispatch_block_async_invoke_and_release + 17
    frame #26: 0x65de126f libdispatch.dylib`_dispatch_client_callout + 14
    frame #27: 0x65de3788 libdispatch.dylib`_dispatch_continuation_pop + 421
    frame #28: 0x65de2ee3 libdispatch.dylib`_dispatch_async_redirect_invoke + 818
    frame #29: 0x65df087d libdispatch.dylib`_dispatch_root_queue_drain + 354
    frame #30: 0x65df0ff3 libdispatch.dylib`_dispatch_worker_thread2 + 109
    frame #31: 0x66024fa0 libsystem_pthread.dylib`_pthread_wqthread + 208
    frame #32: 0x66024e44 libsystem_pthread.dylib`start_wqthread + 36
```
which is a thread in a "light" detach state (aka. coop detach), where we only unset the domain:
https://github.com/mono/mono/blob/4cefdcb7ce2d939ee78fb45d1b4913eb3bc064fd/mono/metadata/threads.c#L6084-L6111

Fixes mono#17926
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants