
[WIP] Fix .view().detach() not handled correctly by AOT autograd #1661

Closed

Conversation

sangongs
Contributor

This is to fix issue pytorch/pytorch#93677

Depends on pytorch/pytorch#86838

@Chillee
Contributor

Chillee commented Oct 14, 2022

@sangongs I don't think this is the right solution. Inductor should not be responsible for handling things like autograd.

@sangongs
Contributor Author

@Chillee Thanks for taking a look at this.

Inductor should not be responsible for handling things like autograd.

Agreed.

This is just a tentative workaround for an AOT autograd issue. AOT autograd currently does not handle detach() correctly. According to @bdhirsh:

it's because we're compiling the whole thing (including the detach() call) into an autograd.Function, and autograd.Function will unconditionally mark all of its forward outputs as requiring gradients.
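
For reference, here is a minimal standalone sketch of that behavior (class and variable names are made up): a detach() done inside an autograd.Function's forward is invisible to callers, because the wrapper re-marks every output as requiring grad.

```python
import torch

class CompiledChunk(torch.autograd.Function):
    # Stand-in for the graph that AOT autograd compiles, including the detach().
    @staticmethod
    def forward(ctx, x):
        return (x * 2).detach()

    @staticmethod
    def backward(ctx, grad_out):
        return 2 * grad_out

x = torch.randn(4, requires_grad=True)
out = CompiledChunk.apply(x)
# In eager mode, (x * 2).detach() has requires_grad=False, but here the
# autograd.Function wrapper marks the output as requiring grad.
print(out.requires_grad)
```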

@bdhirsh made a fix:

We can use autograd.function's mark-nondifferentiable API, to (statically) mark those outputs as not requiring gradients

But the fix does not work as expected if the output is a view, because mark_non_differentiable has no effect in that case:
https://github.com/pytorch/pytorch/blob/ae45dab57e22e3d04516e7dd81ef8dbefd51bfe3/torch/csrc/autograd/custom_function.cpp#L290-L299
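
To illustrate that caveat (again with made-up names, assuming the behavior described above): mark_non_differentiable is respected for a plain output, but reportedly not when the output is a view.

```python
import torch

class NonViewOutput(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = x * 2                      # plain (non-view) output
        ctx.mark_non_differentiable(out)
        return out

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out

class ViewOutput(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        out = (x * 2).view(-1)           # output is a view
        ctx.mark_non_differentiable(out)
        return out

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out

x = torch.randn(4, requires_grad=True)
print(NonViewOutput.apply(x).requires_grad)  # False: the mark is respected
print(ViewOutput.apply(x).requires_grad)     # reportedly True: the mark is ignored for views
```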

See the AOT autograd issue for more details: pytorch/functorch#1052

In my view, there are four options to fix this:

  1. Leave autograd.Function as it is and work around the is_view() problem in Inductor, as this PR does.
  2. Work around the problem in AOT autograd. We are discussing this in the AOT autograd issue, but @bdhirsh apparently has concerns.
  3. Work around the problem in Dynamo. This may take a lot of effort.
  4. Update autograd.Function so that mark_non_differentiable() takes effect when an output is a view. But I am not sure whether this is viable.

@Chillee Do you have any suggestions?

@bdhirsh
Contributor

bdhirsh commented Oct 14, 2022

Oh, to be clear, my vote is probably still for (2): manually add some extra .detach() calls in AOTAutograd, so Inductor isn't forced to worry about requires-grad-ness (thread)
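
Roughly, the idea would be something like the sketch below (purely illustrative; the helper and its arguments are hypothetical, not existing AOTAutograd code): record which outputs required grad in the eager program and re-detach the rest after the compiled autograd.Function runs.

```python
from typing import Callable, Sequence

import torch

def detach_non_grad_outputs(
    compiled_fn: Callable[..., Sequence[torch.Tensor]],
    output_requires_grad: Sequence[bool],
):
    """Hypothetical wrapper: re-detach the outputs that did not require grad
    in the original eager program, so the caller never sees the spurious
    requires_grad=True coming from the compiled autograd.Function."""
    def wrapper(*args):
        outs = compiled_fn(*args)
        return [
            out if needs_grad else out.detach()
            for out, needs_grad in zip(outs, output_requires_grad)
        ]
    return wrapper
```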

@jansel
Contributor

jansel commented Oct 15, 2022

We have migrated torchdynamo to torch._dynamo and will use the pytorch/pytorch repo for future development. Please resubmit this PR to https://github.com/pytorch/pytorch/

More details and instructions to port this PR over can be found in #1588

jansel closed this Oct 15, 2022