-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable sending activation to dispatch queue threads #102887
Enable sending activation to dispatch queue threads #102887
Conversation
Until recently, it was not possible to send activation signal to Apple dispatch queue threads using pthread_kill. So in case a .NET code was running on such a thread, it could end up blocking runtime suspension in some cases. Recently, Apple has implemented a new API to enable sending specific signals to the dispatch queues, the dispatch_allow_send_signals. This change adds a call to that API to the PAL initialization code.
Hmm, our CI machines clearly don't have the latest SDK, so this PR will have to wait till they are updated then. |
@jkotas it seems that our CI machines cannot be updated to the xcode >= 15.4 yet due to some limitations, so I have added a temporary workaround that uses dlopen / dladdr to get the function. |
5ea3596
to
6ba16e0
Compare
Do we need the same change for native AOT? |
Is there a test that we can add to ensure that this works as expected? |
I have a native test that creates a dispatch queue thread and then attempts to send a signal to it and checks that the pthread_kill returns success. It fails without this call and succeeds with it. However, I am not sure how to make a coreclr test that would check that it works. Maybe I can just pinvoke that native test from a .NET process. That would verify that the process is set to allow sending activation signals to dispatch queue threads. |
Good point, we do. |
Run a tight managed infinite loop on the dispatch queue thread and make sure that GC triggered on a different thread does not hang. |
The added coreclr test has revealed that while the signal could be sent now, the inject_activation_handler was not being invoked on the dispatch queue thread. The problem was that all signals are masked by default there.
The test verifies that .NET code running on a dispatch queue thread in a tight loop can get the activation injected.
@jkotas I have added the test. It has revealed that the fix was missing one piece. We also need to unmask the activation signal on the threads that join the runtime, since the dispatch queue ones have it masked off by default. |
Build breaks... |
@@ -1034,6 +1046,27 @@ REDHAWK_PALEXPORT UInt32_BOOL REDHAWK_PALAPI PalRegisterHijackCallback(_In_ PalH | |||
ASSERT(g_pHijackCallback == NULL); | |||
g_pHijackCallback = callback; | |||
|
|||
#ifdef __APPLE__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to work for iOS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should, the new API supports iOS too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is dlopen
of a random library fine wrt. app store rules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, but I hope that we will be able to remove the dlopen very soon, it is a matter of upgrading macOS on our CI machines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ivanpovazan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file is available at /usr/lib/system/introspection/libdispatch.dylib
:
ivanpovazan@EONE-MAC-ARM64 ~ $ find /usr/lib -name "libdispatch.dylib" -exec realpath {} \;
/usr/lib/system/introspection/libdispatch.dylib
ivanpovazan@EONE-MAC-ARM64 ~ $ sw_vers -productVersion
14.5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. I also have the library in the /usr/lib/system/introspection and not in the /usr/lib/system. But when I run a simple c app that invokes that API under lldb and list the images in the process, it shows the path I have used:
[ 8] 502762EE-7AA7-306C-9DBD-88981A86BB78 0x00000001831b2000 /usr/lib/system/libdispatch.dylib
The dlopen works no matter whether I use the /usr/lib/system/libdispatch.dylib or /usr/lib/system/introspection/libdispatch.dylib. However, if I load the one in the /usr/lib/system/introspection, then lldb shows many warnings like
objc[47199]: Class OS_dispatch_workloop is implemented in both /usr/lib/system/libdispatch.dylib (0x1eb1a3ad8) and /usr/lib/system/introspection/libdispatch.dylib (0x1001d8118). One of the two will be used. Which one is undefined.
So it seems that macOS has some redirection there and the path I have in the PR is the right one to use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dylibs don't actually exist on disk in macOS anymore, they're in a system-level dyld cache, which is why 'find' or 'ls' won't find them. dlopen knows to look in the cache though, so that's why the path just works.
Ref: https://mjtsai.com/blog/2020/06/26/reverse-engineering-macos-11-0/
One way to find out which dylib a symbol is supposed to be in is by looking in the Xcode's SDKS:
$ grep dispatch_allow_send_signals /Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib -R
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib/libSystem.B.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib/system/introspection/libdispatch.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib/system/libdispatch.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib/libSystem.B_asan.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
This shows that the symbol is also in libSystem.B.
One interesting thing shows up when grepping in the iOS SDK:
$ grep dispatch_allow_send_signals /Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/lib -R
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/lib/libSystem.B.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
/Applications/Xcode_15.3.0.app/Contents/Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS.sdk/usr/lib/libSystem.B_asan.tbd: _dispatch_after_f, _dispatch_allocator_layout, _dispatch_allow_send_signals,
The symbol is only available in libSystem, not libdispatch (the same happens in tvOS too).
My recommendation would be to dlopen libSystem instead ("/usr/lib/libSystem.dylib"), hopefully that will work on all platforms.
Ideally you'd be able to remove the call to dlopen, that would reliably work on all platforms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally you'd be able to remove the call to dlopen, that would reliably work on all platforms.
Right, I am really waiting for the CI machines to get updated.
I'll give the libSystem a try, hopefully it would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rolfbjarne thank you for the suggestion, the libSystem works too. I'll update this PR.
/azp run runtime-coreclr outerloop |
Azure Pipelines failed to run 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines failed to run 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines failed to run 1 pipeline(s). |
I have just realized I also need to skip the test on macOS that doesn't support the new API, as it would obviously hang. |
/azp list |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-coreclr outerloop |
Azure Pipelines successfully started running 1 pipeline(s). |
Until recently, it was not possible to send activation signal to Apple dispatch queue threads using pthread_kill. So in case a .NET code was running on such a thread, it could end up blocking runtime suspension in some cases. Recently, Apple has implemented a new API to enable sending specific signals to the dispatch queues, the dispatch_allow_send_signals.
This change adds a call to that API to the PAL initialization code.