Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add conversions from vm list operations to emitc #5350

Merged
merged 14 commits into from
Apr 21, 2021

Conversation

simon-camp
Copy link
Contributor

@simon-camp simon-camp commented Apr 8, 2021

This adds basic support for all list operations except for the ones dealing with refs.

Things missing include lists as function arguments/results and basic block arguments.

Co-authored-by: Marius Brehler [email protected]

@google-cla google-cla bot added the cla: yes label Apr 8, 2021
@simon-camp
Copy link
Contributor Author

simon-camp commented Apr 9, 2021

Are we expected to wrap the lists in refs like how it's done in the bytecode dispatcher, or should this be seen as an "implementation detail" of the interpreter?

Current state of the PR (+ deallocation with list API)

iree_vm_type_def_t element_type = iree_vm_type_def_make_value_type(IREE_VM_VALUE_TYPE_I32);
iree_vm_type_def_t* element_type_ptr = &element_type;
iree_vm_list_t* list = NULL;
iree_vm_list_t** list_ptr = &list;
iree_vm_list_create(&element_type, {initial_capacity}, state->allocator, &list);
...
int32_t size = iree_vm_list_size(list);
...
iree_vm_list_release(list); // That's missing currently

Dispatcher (as far as I understand it)

iree_vm_type_def_t element_type = iree_vm_type_def_make_value_type(IREE_VM_VALUE_TYPE_I32);
iree_vm_type_def_t* element_type_ptr = &element_type;
iree_vm_ref_t list_ref = {0};
iree_vm_list_t* list = NULL;
iree_vm_list_t** list_ptr = &list;
iree_vm_list_create(&element_type, {initial_capacity}, state->allocator, &list);
iree_vm_ref_wrap_assign(list, iree_vm_list_type_id(), &list_ref));
...
iree_vm_list_t* list2 = iree_vm_list_deref(*list_ref);
int32_t size = iree_vm_list_size(list2);
...
iree_vm_ref_release(&list_ref);

@benvanik
Copy link
Collaborator

benvanik commented Apr 9, 2021

You don't have to wrap it in a ref but you'll really want to :)
Ideally you should not have to do any special handling for lifetime of list and instead be treating any ref/list/future dict type/etc as the same thing. There's some optimization opportunity where we could use stack-based lists or something but those are ages away. Stuff gets way more complex if you try to specialize, as then things like loading a list from within another list, passing lists to functions, storing lists in variables, or a heterogenous list with multiple types inside of it - some of which are lists and others which are other ref types - get really difficult to handle. Stella is adding some such constructs right now for the core tensorflow/xla import pipelines so you'll want to be building for that generality :)

iree/vm/test/list_ops.mlir Outdated Show resolved Hide resolved

// TODO(simon-camp): Cache the liveness analysis.
ValueLiveness liveness;
if (failed(liveness.recalculate(funcOp))) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO is fine - note that ValueLiveness is pretty much the slowest possible piece of code ever so FYI you'll hit slowdowns on anything bigger than a unit test.

There's also mlir/Analysis/Liveness.h upstream now which does most of the same stuff (I've been meaning to switch to it) - you may be able to use that here and at least not have to worry about my junk implementation :)

/*result=*/emitc::OpaqueType::get(ctx, "iree_vm_list_t**"),
/*operand=*/listOp.getResult());

rewriter.create<emitc::CallOp>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These calls can fail - anything with iree_status_t you'll want to insert an IREE_RETURN_IF_ERROR around and propagate the error back up. This is especially important with calls to external modules as for example the HAL failing to load an executable at runtime may fail and if you ignore the error things will get crashy fast.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added these, or rather something similar as we can't emit a macro invocation with a function call as argument. So the newly added test at least fails as expected.

We are leaking memory though as we don't release the refs before returning. Let's discuss directly how to properly support this.

@benvanik
Copy link
Collaborator

I think this is good now - just let me know when you're done tweaking and I can merge it!

@simon-camp simon-camp force-pushed the emitc-list branch 4 times, most recently from 4af6985 to 568a725 Compare April 20, 2021 10:40
@simon-camp
Copy link
Contributor Author

simon-camp commented Apr 20, 2021

@benvanik I've implemented the handling of ref types on top of the register allocation as discussed last week.

However, it seems it is over estimating the number of registers needed. (Or its something about scratch registers, haven't looked into the code to deeply)

I've created some files here.

Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor fix for an off-by-one on local_refs and moving the value extract to value.h, but otherwise LGTM! I can submit after those fixes

iree/vm/ops.h Outdated
Comment on lines 246 to 237
#define VM_GET_REF_ADDRESS(index) &local_refs[index]

#define VM_RELEASE_REFS() \
for (int i = 0; i < IREE_ARRAYSIZE(local_refs); i++) { \
iree_vm_ref_release(VM_GET_REF_ADDRESS(i)); \
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be nice for readability if these took local_refs as an argument - that would make it clearer when looking at the code where these are getting their state from

iree/vm/ops.h Outdated
Comment on lines 155 to 157
static inline int64_t vm_list_value_extract_i64(iree_vm_value_t* value) {
return value->i64;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can make this a iree_vm_value_t method in value.h as an iree_vm_value_get_i32 as it's not list specific and can be used with other types as well (future dict/etc). It'd fit nicely alongside iree_vm_value_make_i32 (and i64/etc) and be useful in general.

We could in the future make it assert that the type is as expected and also add an iree_vm_value_cast_i32 that would convert if needed (i64->i32, etc). Some of the list code does these casts already but we may be able to remove a lot of that code if we had these helpers.

Comment on lines 372 to 373
auto ref_initializers =
SmallVector<StringRef, 4>{static_cast<size_t>(numRefs), "{0}"};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the max register ordinal is the inclusive max, so you need to +1 here to get the total count

%c1 = vm.const.i32 1 : i32
%list = vm.list.alloc %c1 : (i32) -> !vm.list<i32>
vm.list.set.i32 %list, %c0, %c1 : (!vm.list<i32>, i32, i32)
vm.return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are awesome tests - thanks for adding :)

Comment on lines +413 to +421
// TODO(simon-camp): This is expensive as we recalculate the
// RegisterAllocation for every alloc in a function. We could make it
// compatible with the analysis framework in MLIR which would cache it
// automatically IIUC. See here for reference
// https://mlir.llvm.org/docs/PassManagement/#analysis-management
Copy link
Collaborator

@benvanik benvanik Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice todo - I think the register allocator predates the analysis stuff actually working/being able to be run from within a conversion - since you are running it outside on the FuncOp (which AFAIK will always be processed first before any nested regions) you'll get this for free when supported!

/*templateArgs=*/ArrayAttr{},
/*operands=*/ArrayRef<Value>{setOp.list()});

auto listDerefOp = rewriter.create<emitc::CallOp>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a good example of where emitc->emitc canonicalizations can be useful - all these derefs and such can CSE/fold away in a lot of cases before emitting code, which saves the C compiler processing the code from having to perform alias analysis and a bunch of other gnarly things that are unlikely to work right:

In the gist you posted for example v12/v13 and v15/v16 could fold away, as could the checks on them.

v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);
VM_RETURN_IF_ERROR(v14);
v15 = *(v9);
v16 = iree_vm_list_deref(v15);
VM_RETURN_IF_LIST_NULL(v16);
v17 = iree_vm_list_size(v16);

->

v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);
VM_RETURN_IF_ERROR(v14);
v17 = iree_vm_list_size(v13);

Which then can go even further for local values:

v7 = &v6;
v8 = iree_vm_list_create(v5, v1, state->allocator, v7);
VM_RETURN_IF_ERROR(v8);
v9 = VM_GET_REF_ADDRESS(0);
v10 = iree_vm_list_type_id();
v11 = iree_vm_ref_wrap_assign(v6, v10, v9);
VM_RETURN_IF_ERROR(v11);
v12 = *(v9);
v13 = iree_vm_list_deref(v12);
VM_RETURN_IF_LIST_NULL(v13);
v14 = iree_vm_list_reserve(v13, v2);

->

v7 = &v6;
v8 = iree_vm_list_create(v5, v1, state->allocator, v7);
VM_RETURN_IF_ERROR(v8);
v14 = iree_vm_list_reserve(v7, v2);

that's where the power of emitc will really shine - relying on the downstream compiler to do that same transformation with a bunch of those interleaved function calls to external code is never going to be as good as what you can do here :)

(<3 emitc :)

@benvanik benvanik merged commit 5eae15e into iree-org:main Apr 21, 2021
@simon-camp simon-camp deleted the emitc-list branch April 22, 2021 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants