-
Notifications
You must be signed in to change notification settings - Fork 603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding experimental synchronous executor using inline command buffers. #5509
Conversation
45ab4b3
to
66ed8b7
Compare
66ed8b7
to
9fcf54d
Compare
(typo'ed my own name, but github so it stays :) |
b2f34a3
to
a01e9ff
Compare
// on threads issuing submissions. |loaders| is the set of executable | ||
// loaders that are available for loading in the device context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see... we could even pass a wasm module loader in here :O
So with #4298, #5096 (still blocked on the choice of wasm runtime / memory allocation architecture) also wouldn't even add its own driver module. We'd just conditionally include the executable loader. So each host executable loader will be compatible with the synchronous and asynchronous local HAL devices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep! that's the goal - we can still have a registry of them/compile-time include everything linked in for the iree-run-module cases but tiny IREE cases can conditionally include what they want. it's also how the static library loader works: it's just a list of pointers that you provide for the HAL to lookup in the future: https://github.com/google/iree/blob/f02a32f88242479c37a249d1c6bbe7637545ca36/iree/hal/local/loaders/static_library_loader.h#L44-L48
(this lets you support both dynamic and static models, if you wanted to have some builtin and some deployable, etc)
// iree_hal_sync_semaphore_t | ||
//===----------------------------------------------------------------------===// | ||
|
||
// Creates a semaphore that allows for ordering of operations on the local host. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate a bit more on this (what a "sync semaphore" is)? This is backed by a 400LOC .c
, which is only a bit less than task_semaphore.c
. Seems worth a comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a note -- semaphores are mechanisms to propagate errors, query for completion, etc and are useful whether synchronously executing work or not, and having them in both cases lets us have a consistent API to the application (and make use of them in the compiled output) without having to special case things. Even on bare-metal systems using the synchronous driver they may still have multiple cores and want to query the status of work from one core to another - semaphores are a way to do that in a thread-safe fashion (even if your threads are not traditional threads but independent execution units). On super bare-metal systems with single cores/single threaded usage the iree_slim_mutex_t/iree_notification_t types can be no-oped, in which case this becomes just a way to carry around an integer payload and a status field.
// batching up signals will reduce synchronization overhead. | ||
iree_status_t iree_hal_sync_semaphore_multi_signal( | ||
iree_hal_sync_semaphore_state_t* shared_state, | ||
const iree_hal_semaphore_list_t* semaphore_list); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these semaphores assumed to be created by iree_hal_sync_semaphore_create
? looks like the iree_hal_semaphore_t
type is reused for each implementation. Can implementations be mixed in a list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah not legal to mix HAL types from other devices (we'd need explicit import/export APIs for that), so safe to assume this (some places have some type guards but more could be added whenever we allow multiple simultaneous HAL usage - today there's only ever one HAL device at a time so not something that can be hit)
9fcf54d
to
a2d448b
Compare
This is mostly just the plumbing to enable #4680 by defining a new
dylib-sync
runtime driver (pending #4298 to cleanup drivers).This driver does not support recording/replaying command buffers and has some gotchas with semaphores (no deadline waits, probably corner cases for things not covered in the CTS, etc). It should get us something that builds easier on tiny systems, though.
This uses
iree/base/synchronization.h
for the semaphore implementation. The intent is that for bare-metal systems we can implement a version ofiree/base/synchronization.h
observing the specific requirements of the platform while still preserving the semaphore semantics - it's still useful to use this driver with systems with multiple threads (lots of dual-core MCUs/MPUs) or to enable pipelining. Systems that have no threads can no-op the mutex/notification (or disable interrupts/etc).Trace of bert with
-driver=dylib-sync
looks pretty good:(the large __init block at startup and free at shutdown will be solved by #5504 and #5472)
Note that due to the design here there are no changes needed in the compiler/runtime besides the single flag added in #5508 so the same exact module runs async as well:
That's pretty cool :)
Progress on #4680.