[TOPI][x86] Introduce schedule_injective_from_existing and unify external schedules for all targets #3983

soiferj · 2019-09-20T16:43:35Z

Currently, x86 schedule_extern does not work properly: it treats extern ops as injective ops. This PR introduces a new generic function, schedule_injective_from_existing, which holds the core logic of schedule_injective for each target. schedule_extern then calls this function. This fixes schedule_extern for many targets besides x86.

Related to the discussion here.

@masahi @vinx13 would you be able to take a look?

topi/include/topi/x86/extern.h

soiferj · 2019-09-20T21:34:03Z

topi/python/topi/generic/extern.py

+    for out in outs:
+        if isinstance(out.op, tvm.tensor.ExternOp):
+            continue
+        _schedule_injective(out.op, s)


@vinx13, I have moved this logic from cuda/extern.py to generic/extern.py. Will schedule_injective still call the correct overridden function per target? Or will this just call the default one?

If it just calls the default one, it seems like I have to add a new file, x86/extern.py

Update: it seems like it does call the right one
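To illustrate why the per-target override gets picked up, here is a minimal toy model of a target-dispatched generic function. It does not use TVM and is not TVM's implementation; `ToyGenericFunc`, `target_scope`, and the string "schedules" are hypothetical names made up for this sketch.

```python
from contextlib import contextmanager

_target_stack = []  # innermost active target name is last


@contextmanager
def target_scope(name):
    """Hypothetical stand-in for entering a target scope."""
    _target_stack.append(name)
    try:
        yield
    finally:
        _target_stack.pop()


class ToyGenericFunc:
    """Toy generic function: a default plus per-target overrides."""

    def __init__(self, default):
        self._default = default
        self._overrides = {}

    def register(self, target):
        def decorator(fn):
            self._overrides[target] = fn
            return fn
        return decorator

    def __call__(self, *args):
        # Look up the override for the current target at call time.
        target = _target_stack[-1] if _target_stack else None
        impl = self._overrides.get(target, self._default)
        return impl(*args)


@ToyGenericFunc
def schedule_injective(outs):
    return "default schedule for " + ",".join(outs)


@schedule_injective.register("cuda")
def _schedule_injective_cuda(outs):
    return "cuda schedule for " + ",".join(outs)


def schedule_extern(outs):
    # Generic code calls the generic function; dispatch happens inside it.
    return schedule_injective(outs)


with target_scope("cuda"):
    in_cuda = schedule_extern(["tanh"])
outside = schedule_extern(["tanh"])
```

Because the lookup happens when the generic function is called, code living in a generic module still reaches the override registered for whichever target is active.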

topi/include/topi/x86/extern.h

masahi · 2019-09-20T23:50:28Z

@soiferj if you change the default behavior of schedule_extern in topi/python/topi/generic/extern.py, you should update topi/include/topi/generic/extern.h too.

I would rather move your implementation of schedule_extern in x86/extern.h to generic/extern.h, and use it from both python and x86 cpp.

soiferj · 2019-09-20T23:52:53Z

Sure, I can work on that. Can you give me a pointer for how to call the cpp schedule from Python?

masahi · 2019-09-21T00:22:14Z

The following might be useful. Look for cpp.generic or cpp.nn:
https://github.com/dmlc/tvm/blob/master/topi/python/topi/generic/vision.py#L53
https://github.com/dmlc/tvm/blob/master/topi/python/topi/nn/pooling.py#L114

soiferj · 2019-09-21T23:16:40Z

x86/injective.h and generic/injective.h are different. If I move the logic into generic/extern.h, how do I make sure I'm calling the right injective function? In Python, it seems like the right overridden function is automatically called. That doesn't seem to be the case in C++. If we can solve this, then we can actually have all of the logic be in generic/extern.h and have it properly call each target's schedule_injective.

Sorry for all of the questions, I'm new to this code

masahi · 2019-09-22T02:09:38Z

I see, you are right about the lack of target dispatch mechanism in our c++ topi (introduced for python in #556). It seems such dynamism is not possible in c++ topi at the moment.

I can think of two options:

1. Keep generic/extern.py and generic/extern.h as is, and duplicate the existing logic in cuda/extern.py to x86/extern.py or x86/extern.h.
2. Make our desired change to generic/extern.py and remove generic/extern.h. This requires many code changes, but it makes our codebase more consistent. For example, the cpp schedule_extern is used here, but this is not correct. It should call topi::cuda::schedule_extern(...) directly. This bug is a result of line-by-line porting of python topi to c++.

What do you think? @soiferj @vinx13

soiferj · 2019-09-22T16:01:25Z

Personally, I feel that the more logic we put in C++, the better. That way, we only have to implement things once, and users can use the Python or C++ API with the exact same features.

How about this: I'll implement this change like your first suggestion (duplicate logic in x86/extern.h) because this bug should be fixed urgently.

I'll also post an RFC on the forum about adding a target dispatch API for C++. In fact, the schedules are already registered as generic functions in C++ here. If we have an API like topi::generic::dispatch("schedule_injective"), we can call this in C++ and always have the right implementation per target.

What do you think?

vinx13 · 2019-09-23T00:07:26Z

@soiferj I agree that putting more things on the C++ side is better. The lack of target dispatch in C++ can be confusing to users as they only need to call topi::generic::.... It would be helpful to support that in C++.

masahi · 2019-09-23T00:34:00Z

@soiferj Great! I thought implementing the target dispatch in C++ would not be straightforward, but if you are willing to do it I'm happy to help. You can continue with this PR as you see fit and I'll merge this ASAP.

soiferj · 2019-09-23T18:27:34Z

topi/include/topi/generic/extern.h

+      continue;
+    }
+    Array<Tensor> new_outs = { out };
+    tvm::GenericFunc::Get("schedule_injective")(new_outs);


@masahi or @vinx13, this call seems to work as expected. It calls the correct schedule_injective function for the current target. However, the function that it calls creates its own schedule (see cuda/injective.py). This causes failures in the unit tests. This is probably why the previous implementation used a helper function, _schedule_injective. Can you think of any way to fix this when calling from C++?

What's the problem with cuda/injective.py? Is overriding native generic from Python side a problem in your case?

The specific error is "Direct host side access to device memory is detected in fused_nn_conv2d_multiply_add_nn_relu_1. Did you forget to bind?". This is being hit in tutorials/frontend/using_external_lib.py.

I think it has to do with the fact that schedule_injective creates its own schedule, but I'm not totally sure.

Update: I just tried changing override_native_generic_func to generic_func and it has the same issue.

Thanks a lot for all of the help, btw. I'm just trying to avoid duplicating code :( I think if we solve this, we can open the door to cleaner refactoring.

You are right, we can't create a new schedule using schedule_injective. So the problem is that we don't want to duplicate the helper function in C++, right?

Yes, exactly. Maybe we should create a new generic function? Or have a generic function that is overridden with more arguments? Is that possible?

I don't think a generic func can be overridden by number of args. We can register the function as a new generic func.

soiferj · 2019-09-23T20:57:36Z

I just added a new generic function schedule_injective_from_existing. I tried to update all of the core scheduling logic for all targets into this function, and changed all of the places that used _schedule_injective to call this function instead. Let me know what you think.
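The difference between the two functions can be sketched with a toy schedule object. This is an illustration only, not TOPI code: `ToySchedule`, the `"bind_threads"` directive, and the dict-based outputs are hypothetical, and the "broken" variant is just one plausible reading of the failure discussed above.

```python
class ToySchedule:
    """Toy schedule: records which directives were applied to which op."""

    def __init__(self, outs):
        self.outs = list(outs)
        self.directives = {}  # op name -> list of applied directives


def schedule_injective_from_existing(sch, out):
    # Core logic: mutate the schedule that was passed in.
    sch.directives.setdefault(out["name"], []).append("bind_threads")
    return sch


def schedule_injective(outs):
    # Creates its own schedule, then reuses the core logic.
    sch = ToySchedule(outs)
    for out in outs:
        schedule_injective_from_existing(sch, out)
    return sch


def schedule_extern(outs):
    # One schedule for everything; extern ops are skipped.
    sch = ToySchedule(outs)
    for out in outs:
        if out["is_extern"]:
            continue
        schedule_injective_from_existing(sch, out)
    return sch


def schedule_extern_broken(outs):
    # Calling schedule_injective here builds a separate schedule whose
    # directives never reach the one that is returned.
    sch = ToySchedule(outs)
    for out in outs:
        if not out["is_extern"]:
            schedule_injective([out])  # result discarded
    return sch


outs = [
    {"name": "cublas_matmul", "is_extern": True},
    {"name": "relu", "is_extern": False},
]
good = schedule_extern(outs)
bad = schedule_extern_broken(outs)
```

In this toy, `good` carries the directive for `relu` while `bad` carries none, which is why the shared logic has to accept an existing schedule rather than create one.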

I see a lot of areas to refactor - in another change I think I'll move all of the schedule_injective logic into generic/injective.h, and move all of the schedule_injective_from_existing logic out of Python and into C++. Right now it has to be duplicated because the unit tests rely on it being in C++, but Python has the updated logic.

soiferj · 2019-09-25T14:04:54Z

Yeah, I am a little confused by that too. I tried to mirror what schedule_injective does. The C++ code takes in a target, and the Python code doesn’t.

In my testing, yes, the C++ code calls back into Python. I can try to remove target from the C++ interface and fix that unit test. Is that alright?

In the future, we should also remove target from the schedule function signature.

masahi · 2019-09-25T14:12:55Z

Yes, I prefer fixing the unit test and cleaning up the interface, if you can.

soiferj · 2019-09-25T14:15:56Z

Thanks a lot. Sorry for the long back-and-forth. I’m still pretty new to this codebase and am trying to make sure I do this the right way.

soiferj · 2019-09-25T16:55:02Z

This is a CPP unit test - how can I set the current target?

soiferj · 2019-09-25T17:54:22Z

@masahi or @vinx13 I am testing this new change, and while my code is being called as expected, I'm still confused as to whether this actually works. For example, I am testing the last few ops of BERT base, matmul -> bias_add -> tanh. In TVM, this will be fused to fused_nn.dense_add_tanh. When I print the outs that are passed into schedule_extern, the list has only one element, and it is tanh. Because of this, schedule_extern always calls schedule_injective_from_existing. It seems like nn.dense isn't being properly scheduled.

Would one of you be able to look at this script and verify whether the behavior is expected?

https://pastebin.com/tJi2rnG2

Edit: That being said, the performance issue does seem to be resolved, I'm just a little confused as to why it works :)

vinx13 · 2019-09-25T18:22:36Z

@soiferj Only the final output of the fused group will be passed to the schedule function, and that's why we need to use traverse during scheduling (for example https://github.com/dmlc/tvm/blob/master/topi/python/topi/cuda/conv2d.py#L153)
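As a rough sketch of what traversing from the final output means, the toy below walks producers starting at the last op of a fused group. It does not use TVM; `ToyOp`, its `tag` field, and `traverse` are hypothetical names for this example, and the op chain mirrors the dense -> add -> tanh case from the question.

```python
class ToyOp:
    """Toy operation node: a name, a tag, and the ops that produce its inputs."""

    def __init__(self, name, tag, inputs=()):
        self.name = name
        self.tag = tag
        self.inputs = list(inputs)


def traverse(op, visit, seen=None):
    """Visit every producer reachable from `op`, producers first."""
    if seen is None:
        seen = set()
    if op.name in seen:
        return
    seen.add(op.name)
    for producer in op.inputs:
        traverse(producer, visit, seen)
    visit(op)


data = ToyOp("data", "placeholder")
dense = ToyOp("dense", "dense", [data])
add = ToyOp("add", "injective", [dense])
tanh = ToyOp("tanh", "injective", [add])

# Only the final output is handed to the schedule function.
outs = [tanh]

visited = []
for out in outs:
    traverse(out, lambda op: visited.append(op.name))

# A schedule function could use the tag to find the op it cares about.
dense_ops = []
for out in outs:
    traverse(out, lambda op: dense_ops.append(op.name) if op.tag == "dense" else None)
```

Even though `outs` holds only `tanh`, the walk reaches `dense`, so a schedule that traverses can still find and schedule it.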

soiferj · 2019-09-25T18:25:11Z

I see. But if you look at line 140, extern schedule returns immediately. In my example, the fused op is never fully traversed since it calls schedule_dense, which immediately returns schedule_injective for tanh. There is no traversal. Should a traversal be happening within schedule_extern?

It almost seems like the callback should be returning schedule_extern and the schedule function should only be calling traverse. I believe this only works in my case because dense is never actually scheduled.

vinx13 · 2019-09-25T18:49:41Z

This is a CPP unit test - how can I set the current target?

Target::Current

vinx13 · 2019-09-25T18:55:36Z

For your case, I'm expecting schedule_injective_from_existing to be called for tanh, and add to be inlined (via AutoInlineInjective, which also traverses the stages).

soiferj · 2019-09-25T19:37:55Z

Thanks, it seems you're right. AutoInlineInjective seems to do the right thing here. Also, regarding the CPP unit test, the current target can be retrieved with Target::Current, but it seems it cannot be set. This unit test calls topi::cuda::schedule_injective directly, so a target needs to be set.

vinx13 · 2019-09-25T21:12:22Z

Let's fix the test. As schedules are target-specific, we expect the target to be set before calling them. We can set the target (EnterTargetScope) at the beginning of the test.

soiferj · 2019-09-26T02:44:57Z

Ok got the tests working :)

masahi · 2019-09-26T03:04:50Z

topi/src/topi.cc

+ */
+inline PackedFunc WrapScheduleFromExisting(FTVMScheduleFromExistingBuilder builder) {
+  return PackedFunc([builder](TVMArgs args, TVMRetValue* ret) {
+    *ret = builder(args[1], args[2]);


Shouldn't it be args[0] and args[1]?

You're right, fixed
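The indexing issue can be shown with a small Python analogue of the wrapper. This is only a sketch: `wrap_schedule_from_existing`, `wrap_off_by_one`, and `toy_builder` are hypothetical, and how the real C++ PackedFunc reacts to an out-of-range argument index is not modelled here.

```python
def wrap_schedule_from_existing(builder):
    """Wrap a two-argument builder so it can be called with packed args."""
    def packed(*args):
        # A two-argument call fills positions 0 and 1.
        return builder(args[0], args[1])
    return packed


def wrap_off_by_one(builder):
    """The variant from the diff under review, reading positions 1 and 2."""
    def packed(*args):
        return builder(args[1], args[2])
    return packed


def toy_builder(sch, out):
    # Stand-in for a schedule builder; just echoes what it received.
    return (sch, out)


result = wrap_schedule_from_existing(toy_builder)("sch", "out")

try:
    wrap_off_by_one(toy_builder)("sch", "out")
    off_by_one_failed = False
except IndexError:
    # With two arguments there is no position 2.
    off_by_one_failed = True
```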

masahi · 2019-09-26T03:12:04Z

Great @soiferj I will merge after CI.

soiferj · 2019-09-26T03:12:58Z

Awesome, thanks again for all of your help!

masahi · 2019-09-26T05:49:07Z

thanks @soiferj @vinx13 this is merged.

…rnal schedules for all targets (apache#3983)

* Fix extern schedule for x86
* Register x86::schedule_extern
* Fix
* Fix
* Replace extern.py with extern.h
* Introduce new generic function schedule_injective_from_existing
* Fix
* Fix
* Add back to C++
* Fix style
* Injective schedule calls local schedule_injective_from_existing
* Fix
* Remove target arg from schedule_injective_from_existing
* Fix docs
* Try to fix unit test
* Fix test
* Fix other tests
* Fix bug

Fix extern schedule for x86

317186f

soiferj changed the title from "Fix extern schedule for x86" to "[TOPI] Fix extern schedule for x86" on Sep 20, 2019

soiferj changed the title from "[TOPI] Fix extern schedule for x86" to "[TOPI][x86] Fix extern schedule for x86" on Sep 20, 2019

vinx13 reviewed Sep 20, 2019

topi/include/topi/x86/extern.h

soiferj commented Sep 20, 2019

jonso4 added 3 commits September 20, 2019 14:34

Register x86::schedule_extern

299a80b

Fix

d7b21f8

Fix

2616393

masahi reviewed Sep 20, 2019

topi/include/topi/x86/extern.h

yzhliu assigned masahi Sep 22, 2019

Replace extern.py with extern.h

4fb7063

soiferj commented Sep 23, 2019

Introduce new generic function schedule_injective_from_existing

9df9010

jonso4 added 6 commits September 23, 2019 15:47

Fix

2bdc1f1

Fix

09d5f0c

Add back to C++

5c9e9f4

Fix style

1e42314

Injective schedule calls local schedule_injective_from_existing

7da8ea2

Fix

decbb5e

soiferj requested review from masahi and vinx13 September 24, 2019 22:22

jonso4 added 3 commits September 25, 2019 09:07

Remove target arg from schedule_injective_from_existing

847bb5d

Fix docs

eab0b01

Try to fix unit test

2496cd7

jonso4 added 2 commits September 25, 2019 14:49

Fix test

419f846

Fix other tests

7aeac5e

masahi reviewed Sep 26, 2019

vinx13 approved these changes Sep 26, 2019

Fix bug

b698a15

masahi approved these changes Sep 26, 2019

masahi merged commit b330d30 into apache:master Sep 26, 2019

soiferj deleted the densefixes branch September 26, 2019 16:04

tqchen mentioned this pull request Nov 8, 2019

[RELEASE][DRAFT] TVM v0.6 Release candidate #4259

Closed
