-
Notifications
You must be signed in to change notification settings - Fork 603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add conversions from vm list operations to emitc #5350
Conversation
iree/compiler/Dialect/VM/Conversion/VMToEmitC/ConvertVMToEmitC.cpp
Outdated
Show resolved
Hide resolved
iree/compiler/Dialect/VM/Conversion/VMToEmitC/ConvertVMToEmitC.cpp
Outdated
Show resolved
Hide resolved
iree/compiler/Dialect/VM/Conversion/VMToEmitC/ConvertVMToEmitC.cpp
Outdated
Show resolved
Hide resolved
Are we expected to wrap the lists in refs like how it's done in the bytecode dispatcher, or should this be seen as an "implementation detail" of the interpreter? Current state of the PR (+ deallocation with list API) iree_vm_type_def_t element_type = iree_vm_type_def_make_value_type(IREE_VM_VALUE_TYPE_I32);
iree_vm_type_def_t* element_type_ptr = &element_type;
iree_vm_list_t* list = NULL;
iree_vm_list_t** list_ptr = &list;
iree_vm_list_create(&element_type, {initial_capacity}, state->allocator, &list);
...
int32_t size = iree_vm_list_size(list);
...
iree_vm_list_release(list); // That's missing currently Dispatcher (as far as I understand it) iree_vm_type_def_t element_type = iree_vm_type_def_make_value_type(IREE_VM_VALUE_TYPE_I32);
iree_vm_type_def_t* element_type_ptr = &element_type;
iree_vm_ref_t list_ref = {0};
iree_vm_list_t* list = NULL;
iree_vm_list_t** list_ptr = &list;
iree_vm_list_create(&element_type, {initial_capacity}, state->allocator, &list);
iree_vm_ref_wrap_assign(list, iree_vm_list_type_id(), &list_ref));
...
iree_vm_list_t* list2 = iree_vm_list_deref(*list_ref);
int32_t size = iree_vm_list_size(list2);
...
iree_vm_ref_release(&list_ref); |
You don't have to wrap it in a ref but you'll really want to :) |
|
||
// TODO(simon-camp): Cache the liveness analysis. | ||
ValueLiveness liveness; | ||
if (failed(liveness.recalculate(funcOp))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO is fine - note that ValueLiveness is pretty much the slowest possible piece of code ever so FYI you'll hit slowdowns on anything bigger than a unit test.
There's also mlir/Analysis/Liveness.h
upstream now which does most of the same stuff (I've been meaning to switch to it) - you may be able to use that here and at least not have to worry about my junk implementation :)
/*result=*/emitc::OpaqueType::get(ctx, "iree_vm_list_t**"), | ||
/*operand=*/listOp.getResult()); | ||
|
||
rewriter.create<emitc::CallOp>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These calls can fail - anything with iree_status_t you'll want to insert an IREE_RETURN_IF_ERROR around and propagate the error back up. This is especially important with calls to external modules as for example the HAL failing to load an executable at runtime may fail and if you ignore the error things will get crashy fast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added these, or rather something similar as we can't emit a macro invocation with a function call as argument. So the newly added test at least fails as expected.
We are leaking memory though as we don't release the refs before returning. Let's discuss directly how to properly support this.
iree/compiler/Dialect/VM/Conversion/VMToEmitC/ConvertVMToEmitC.cpp
Outdated
Show resolved
Hide resolved
iree/compiler/Dialect/VM/Conversion/VMToEmitC/ConvertVMToEmitC.cpp
Outdated
Show resolved
Hide resolved
I think this is good now - just let me know when you're done tweaking and I can merge it! |
4af6985
to
568a725
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor fix for an off-by-one on local_refs and moving the value extract to value.h, but otherwise LGTM! I can submit after those fixes
iree/vm/ops.h
Outdated
#define VM_GET_REF_ADDRESS(index) &local_refs[index] | ||
|
||
#define VM_RELEASE_REFS() \ | ||
for (int i = 0; i < IREE_ARRAYSIZE(local_refs); i++) { \ | ||
iree_vm_ref_release(VM_GET_REF_ADDRESS(i)); \ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it'd be nice for readability if these took local_refs as an argument - that would make it clearer when looking at the code where these are getting their state from
iree/vm/ops.h
Outdated
static inline int64_t vm_list_value_extract_i64(iree_vm_value_t* value) { | ||
return value->i64; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can make this a iree_vm_value_t
method in value.h
as an iree_vm_value_get_i32
as it's not list specific and can be used with other types as well (future dict/etc). It'd fit nicely alongside iree_vm_value_make_i32
(and i64/etc) and be useful in general.
We could in the future make it assert that the type is as expected and also add an iree_vm_value_cast_i32
that would convert if needed (i64->i32, etc). Some of the list code does these casts already but we may be able to remove a lot of that code if we had these helpers.
auto ref_initializers = | ||
SmallVector<StringRef, 4>{static_cast<size_t>(numRefs), "{0}"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the max register ordinal is the inclusive max, so you need to +1 here to get the total count
%c1 = vm.const.i32 1 : i32 | ||
%list = vm.list.alloc %c1 : (i32) -> !vm.list<i32> | ||
vm.list.set.i32 %list, %c0, %c1 : (!vm.list<i32>, i32, i32) | ||
vm.return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are awesome tests - thanks for adding :)
// TODO(simon-camp): This is expensive as we recalculate the | ||
// RegisterAllocation for every alloc in a function. We could make it | ||
// compatible with the analysis framework in MLIR which would cache it | ||
// automatically IIUC. See here for reference | ||
// https://mlir.llvm.org/docs/PassManagement/#analysis-management |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice todo - I think the register allocator predates the analysis stuff actually working/being able to be run from within a conversion - since you are running it outside on the FuncOp (which AFAIK will always be processed first before any nested regions) you'll get this for free when supported!
/*templateArgs=*/ArrayAttr{}, | ||
/*operands=*/ArrayRef<Value>{setOp.list()}); | ||
|
||
auto listDerefOp = rewriter.create<emitc::CallOp>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good example of where emitc->emitc canonicalizations can be useful - all these derefs and such can CSE/fold away in a lot of cases before emitting code, which saves the C compiler processing the code from having to perform alias analysis and a bunch of other gnarly things that are unlikely to work right:
In the gist you posted for example v12/v13 and v15/v16 could fold away, as could the checks on them.
v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);
VM_RETURN_IF_ERROR(v14);
v15 = *(v9);
v16 = iree_vm_list_deref(v15);
VM_RETURN_IF_LIST_NULL(v16);
v17 = iree_vm_list_size(v16);
->
v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);
VM_RETURN_IF_ERROR(v14);
v17 = iree_vm_list_size(v13);
Which then can go even further for local values:
v7 = &v6;
v8 = iree_vm_list_create(v5, v1, state->allocator, v7);
VM_RETURN_IF_ERROR(v8);
v9 = VM_GET_REF_ADDRESS(0);
v10 = iree_vm_list_type_id();
v11 = iree_vm_ref_wrap_assign(v6, v10, v9);
VM_RETURN_IF_ERROR(v11);
v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);
->
v7 = &v6;
v8 = iree_vm_list_create(v5, v1, state->allocator, v7);
VM_RETURN_IF_ERROR(v8);
v14 = iree_vm_list_reserve(v7, v2);
that's where the power of emitc will really shine - relying on the downstream compiler to do that same transformation with a bunch of those interleaved function calls to external code is never going to be as good as what you can do here :)
(<3 emitc :)
We reuse the register allocation pass and emit an array of refs instead of one ref per value, similar to the bytecode module.
This adds basic support for all list operations except for the ones dealing with refs.
Things missing include lists as function arguments/results and basic block arguments.
Co-authored-by: Marius Brehler [email protected]