[Bug.Relay.InferType] Type Inference Report Mismatch after One Operator Is Removed #8432
Comments
After a day of debugging, I found the root cause of this issue: it is the "if" statement below. tvm/src/relay/transforms/type_infer.cc, lines 557 to 559 at commit 1d7a9e9
Looking at the git history, it was added by PR #2437. Using the above test case to explain why the issue happens: before the 1st invocation of the "InferType" pass, the types of the function are something like below.
The return type of the function is not defined, so the "if" statement is skipped and type inference proceeds normally.
After my pass "SimplifyPad" removes the "nn.pad" operator, the 2nd invocation of the "InferType" pass happens, and the types of the function at this time are something like below.
The key difference at this point is the return type of the function: it is now defined as "Tensor[(1, 224, 224, 64), int32]". Because the last expression of the function is "nn.conv2d", the return type of "nn.conv2d" is the return type of the function. With the code of lines 557~559, the return type of "nn.conv2d" is changed from "IncompleteTypeNode(0, 0xYYYYYY)" to "Tensor[(1, 224, 224, 64), int32]" as well. Then the function "Conv2DRel" is called to infer the return type of this "nn.conv2d", but the last item of its parameter "types" is now "Tensor[(1, 224, 224, 64), int32]" instead of "IncompleteTypeNode(0, 0xYYYYYY)". The type inference logic of "Conv2DRel" concludes that the return type of "nn.conv2d" should be "Tensor[(1, 218, 218, 64), int32]", so "tvm::relay::TypeSolver::Unifier::Unify" sees the return type of "nn.conv2d" inferred as two different types and reports the error message. From this analysis, the issue will happen as long as a pass changes the shape of the return type of a Relay function; in other words, if a pass does not change the final shape of the return type, the issue will not be triggered. @slyubomirsky @jroesch @tqchen I don't know whether we can just remove the "if" statement of L557-L559 to fix this issue; what are your opinions? Thanks.
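The conflict described above can be modeled in plain Python. This is only a sketch of the unification logic, not TVM's actual implementation: `unify` and the VALID-padding shape formula are simplified assumptions standing in for `TypeSolver::Unifier::Unify` and `Conv2DRel`.

```python
def conv2d_out_shape(in_shape, kernel_hw=(7, 7), channels=64):
    # No-padding NHWC conv2d output shape: H_out = H_in - K + 1.
    n, h, w, _ = in_shape
    kh, kw = kernel_hw
    return (n, h - kh + 1, w - kw + 1, channels)

def unify(annotated, inferred):
    # Simplified stand-in for TypeSolver::Unifier::Unify: an undefined
    # (None) annotation simply takes the inferred type; a concrete
    # annotation must match exactly.
    if annotated is None:
        return inferred
    if annotated != inferred:
        raise TypeError(f"incompatible types: {annotated} vs {inferred}")
    return annotated

# 1st InferType: the body still contains nn.pad, so conv2d sees
# (1, 230, 230, 3), and the function return type is undefined, so
# unify just fills it in.
first = unify(None, conv2d_out_shape((1, 230, 230, 3)))
assert first == (1, 224, 224, 64)

# 2nd InferType: nn.pad was removed, conv2d now sees (1, 224, 224, 3)
# and infers (1, 218, 218, 64), but the stale annotated return type is
# still (1, 224, 224, 64) -> Unify reports a mismatch.
try:
    unify(first, conv2d_out_shape((1, 224, 224, 3)))
except TypeError as e:
    print("type mismatch:", e)
```

This also shows why the error only appears when a pass changes the final output shape: if the freshly inferred type equals the stale annotation, `unify` succeeds silently.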
@slyubomirsky @jroesch @altanh it would be great if you can help follow up on this.
Confirmed that the error still occurs on main with the example script.
Hi @Lunderberg @Johnson9009, I had the same problem and found a way to circumvent it.
Standard Output and Error Message
Reproduce Test Case
Current Clue
We can see that the first InferType pass works well: before the pass "SimplifyPad", the inferred output shape of "nn.conv2d" is (1, 224, 224, 64). The pass "SimplifyPad" then removes the operator "nn.pad", and because "SimplifyPad" is a function pass, the InferType pass is executed again automatically. The error happens in this 2nd InferType pass.
tvm/src/relay/ir/transform.cc, lines 157 to 163 at commit 683c5eb
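The shape change behind this error can be reproduced with plain shape arithmetic. This is a sketch only: the pad width of 3 on each side is an assumption inferred from the 224 -> 230 shapes in the type dumps, and the conv uses no implicit padding.

```python
def pad_out_shape(in_shape, pad=3):
    # Model of nn.pad with pad_width 3 on H and W (assumed from 224 -> 230).
    n, h, w, c = in_shape
    return (n, h + 2 * pad, w + 2 * pad, c)

def conv2d_out_shape(in_shape, kernel=7, channels=64):
    # No-padding NHWC conv2d: H_out = H_in - K + 1.
    n, h, w, _ = in_shape
    return (n, h - kernel + 1, w - kernel + 1, channels)

inp = (1, 224, 224, 3)

# Before SimplifyPad: nn.pad -> nn.conv2d
before = conv2d_out_shape(pad_out_shape(inp))   # (1, 224, 224, 64)

# After SimplifyPad: nn.conv2d only
after = conv2d_out_shape(inp)                   # (1, 218, 218, 64)

print(before, after)
```

The two results differ, which is exactly the condition under which the 2nd InferType pass fails.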
When the 2nd InferType calls the function Conv2DRel, the value of the parameter "types" is "[TensorType([1, 224, 224, 3], int8), TensorType([7, 7, 3, 64], int8), TensorType([1, 224, 224, 64], int32)]". The last item of "types" may be wrong, because the value of this parameter during the 1st InferType was "[TensorType([1, 230, 230, 3], int8), TensorType([7, 7, 3, 64], int8), IncompleteTypeNode(0, 0x5d6e270)]".
tvm/src/relay/op/nn/convolution.h, lines 133 to 136 at commit 683c5eb
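The role of that last slot in "types" can be sketched with a toy relation function. This is a simplified model, not the real Conv2DRel in convolution.h (which handles layouts, strides, dilation, and dtypes); shapes are NHWC data and HWIO weights as in the dumps above, and an incomplete type is modeled as `None`.

```python
def conv2d_rel(types):
    # types = [data, weight, output]; output may be incomplete (None).
    data, weight, out = types
    n, h, w, _ = data
    kh, kw, _, oc = weight
    inferred = (n, h - kh + 1, w - kw + 1, oc)
    if out is None:
        types[2] = inferred      # fill in the incomplete output slot
        return True
    return out == inferred       # otherwise the solver must unify them

# 1st InferType: the output slot is incomplete, so the relation resolves it.
t1 = [(1, 230, 230, 3), (7, 7, 3, 64), None]
assert conv2d_rel(t1) and t1[2] == (1, 224, 224, 64)

# 2nd InferType: the output slot was pre-filled with the stale function
# return type, and the freshly inferred shape disagrees.
t2 = [(1, 224, 224, 3), (7, 7, 3, 64), (1, 224, 224, 64)]
assert not conv2d_rel(t2)   # Unify would report this as an error
```

In this model the relation itself is fine; the problem is that the 2nd run hands it a concrete output type left over from the 1st run instead of an incomplete one.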
The type-solver-related code is hard to understand, so I want to know: is this a bug, or is the pass I wrote missing something?
Thanks a lot.