
[fix] update the condition for aliveness of TensorWrapper #98748

Conversation

@kshitij12345 (Collaborator) commented Apr 10, 2023

Fixes #95561
Fixes #98021

@pytorch-bot (bot) commented Apr 10, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/98748

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b74e160:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@kshitij12345 kshitij12345 marked this pull request as ready for review April 10, 2023 15:36
@kshitij12345 added the release notes: functorch label (release notes category; pertaining to torch.func or pytorch/functorch) Apr 10, 2023
@@ -3204,13 +3204,9 @@ def f(x):
 
         B = 5
         x = torch.randn(B, 3)
-        with self.assertRaises(RuntimeError):
+        with self.assertRaisesRegex(RuntimeError, "Batching rule not implemented for aten::_make_dual"):
@kshitij12345 (Collaborator, Author) commented on Apr 10, 2023:
Added a regex to match, for clarity about which failure is expected.
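
(For reference, a minimal standalone sketch, not from this PR, of why assertRaisesRegex is the stricter check: the block passes only if the raised RuntimeError also carries the expected message, so an unrelated RuntimeError would make the test fail instead of silently passing.)

import unittest

class Demo(unittest.TestCase):
    def test_expected_failure_message(self):
        # Passes only because both the exception type and its message match.
        with self.assertRaisesRegex(RuntimeError, "Batching rule not implemented"):
            raise RuntimeError("Batching rule not implemented for aten::_make_dual")

    def test_any_runtime_error(self):
        # assertRaises is satisfied by any RuntimeError, whatever its message says.
        with self.assertRaises(RuntimeError):
            raise RuntimeError("some unrelated internal assert")

if __name__ == "__main__":
    unittest.main()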

            vmap(f)(x)

        x = torch.randn([])
        with self.assertRaises(RuntimeError):
            grad(f)(x)
@kshitij12345 (Collaborator, Author) commented:

The test used to fail due to this issue:

import torch
from torch.func import jacfwd, jacrev, vmap, vjp, jvp, grad
from functorch import make_fx
from torch._C._functorch import unwrap_if_dead, is_dead_tensor_wrapper
from torch.autograd.forward_ad import make_dual, dual_level

torch.manual_seed(420)

x = torch.randn(())

def f(x):
    y = torch.autograd.functional.jacobian(
                lambda x: x.sin().sum(), x, strategy='forward-mode', vectorize=True)
    return y

grad(f)(x)

Output:

RuntimeError: unwrapped_count > 0 INTERNAL ASSERT FAILED at "/home/kshiteej/Pytorch/pytorch_functorch/aten/src/ATen/functorch/TensorWrapper.cpp":213, please report a bug to PyTorch. Should have at least one dead wrapper

A reviewer (Contributor) commented:

I'm kind of confused: how does test_autograd_functional_jvp_inside_transform succeed (in that grad(f)(x) raises a RuntimeError), while in test_autograd_functional_jacfwd_inside_transform the RuntimeError isn't raised?

A reviewer (Contributor) commented:

Also, do you know why this succeeds now? We are not creating a new torch.tensor from a list of tensors, so it sounds like there is some interesting interaction going on.

@kshitij12345 (Collaborator, Author) commented on Apr 11, 2023:

It looks like this happens because of a bad interaction with the legacy _vmap. If I update it to torch.func.vmap, the code fails with the "Batching rule not implemented" error for make_dual.

outputs_before_split = _vmap(jvp)(tangents)
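
As a side note, here is a small sketch of the distinction between the two vmaps (the import path for the legacy one is taken from how torch.autograd.functional imports it, so treat the exact location as an assumption rather than a public API). Both batch over the leading dimension, but they come from different machinery, which is presumably why they interact differently with functorch transforms:

import torch
from torch._vmap_internals import _vmap  # legacy vmap used by torch.autograd.functional
from torch.func import vmap              # composable torch.func / functorch vmap

xs = torch.randn(5, 3)

# Both map torch.sin over the leading (batch) dimension of xs.
print(_vmap(torch.sin)(xs).shape)  # torch.Size([5, 3])
print(vmap(torch.sin)(xs).shape)   # torch.Size([5, 3])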

Also, this doesn't really succeed in a usable way. In the following script, it returns a BatchedTensor, which fails if you try to print it or compare it.

import torch
from torch.func import jacfwd, jacrev, vmap, vjp, jvp, grad
from functorch import make_fx
from torch._C._functorch import unwrap_if_dead, is_dead_tensor_wrapper
from torch.autograd.forward_ad import make_dual, dual_level

torch.manual_seed(420)

x = torch.randn(())

def f(x):
    y = torch.autograd.functional.jacobian(
                lambda x: x.sin().sum(), x, strategy='forward-mode', vectorize=True)
    return y

def f_exp(x):
    y = jacrev(lambda x: x.sin().sum())(x)
    return y

j = jacrev(f)(x)
# print(j)  # RuntimeError: Batching rule not implemented for aten::is_nonzero. We could not generate a fallback.

expected_j = jacrev(f_exp)(x)
print(expected_j)  # Works

torch.testing.assert_close(j, expected_j)  # RuntimeError: Batching rule not implemented for aten::is_nonzero. We could not generate a fallback.

I don't think legacy vmap is expected to work with functorch transforms, right?
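
For comparison, a minimal sketch (mine, not from the PR) of the composable path that does nest cleanly: building the Jacobian with torch.func.jacfwd instead of torch.autograd.functional.jacobian, and then taking grad of the result.

import torch
from torch.func import grad, jacfwd

x = torch.randn(())

def f(x):
    # For scalar x, the Jacobian of sin(x).sum() is just cos(x).
    return jacfwd(lambda x: x.sin().sum())(x)

print(grad(f)(x))     # -sin(x): grad composes with jacfwd
print(-torch.sin(x))  # matches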

@kshitij12345 (Collaborator, Author) commented:

How does test_autograd_functional_jvp_inside_transform succeed (in that grad(f)(x) raises a RuntimeError) but in test_autograd_functional_jacfwd_inside_transform the RuntimeError isn't raised?

The two take different approaches to computing the jvp.

Under jacobian:

# Step 2: Compute vmap over computation with dual tensors
def jvp(tangents):
    with fwAD.dual_level():
        dual_inputs = tuple(
            fwAD.make_dual(input, tangent.view_as(input)) for input, tangent in zip(inputs, tangents))
        _is_outputs_tuple, dual_outputs = _as_tuple(func(*dual_inputs), "outputs")
        output_info.append(_is_outputs_tuple)
        jv = []
        primal_outs = []
        for dual_out in dual_outputs:
            primal, tangent = fwAD.unpack_dual(dual_out)
            primal_outs.append(primal)
            if tangent is not None:
                jv.append(tangent)
            else:
                jv.append(torch.zeros_like(primal))
        output_info.append(primal_outs)
        return tuple(jv)

outputs_before_split = _vmap(jvp)(tangents)
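
In other words, jacobian(strategy='forward-mode') pushes a dual (primal, tangent) pair through the function and reads the tangent off the output. A minimal standalone sketch of that dual-tensor mechanism using the public forward-mode API (my example, not the code above):

import torch
import torch.autograd.forward_ad as fwAD

x = torch.randn(3)
t = torch.ones(3)  # tangent direction

with fwAD.dual_level():
    dual_x = fwAD.make_dual(x, t)
    dual_out = dual_x.sin().sum()
    primal, tangent = fwAD.unpack_dual(dual_out)

print(primal)                    # sin(x).sum()
print(tangent)                   # jvp along t
print((torch.cos(x) * t).sum())  # matches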

Under jvp: (this one hits a different failure - "You are attempting to call Tensor.requires_grad_() (or perhaps using torch.autograd.functional.* APIs) inside of a function being transformed by a functorch transform.")

with torch.enable_grad():
    is_inputs_tuple, inputs = _as_tuple(inputs, "inputs", "jvp")
    inputs = _grad_preprocess(inputs, create_graph=create_graph, need_graph=True)

    if v is not None:
        _, v = _as_tuple(v, "v", "jvp")
        v = _grad_preprocess(v, create_graph=create_graph, need_graph=False)
        _validate_v(v, inputs, is_inputs_tuple)
    else:
        if len(inputs) != 1 or inputs[0].nelement() != 1:
            raise RuntimeError("The vector v can only be None if the input to "
                               "the user-provided function is a single Tensor "
                               "with a single element.")

    outputs = func(*inputs)
    is_outputs_tuple, outputs = _as_tuple(outputs, "outputs of the user-provided function", "jvp")
    _check_requires_grad(outputs, "outputs", strict=strict)

    # The backward is linear so the value of grad_outputs is not important as
    # it won't appear in the double backward graph. We only need to ensure that
    # it does not contain inf or nan.
    grad_outputs = tuple(torch.zeros_like(out, requires_grad=True) for out in outputs)

    grad_inputs = _autograd_grad(outputs, inputs, grad_outputs, create_graph=True)
    _check_requires_grad(grad_inputs, "grad_inputs", strict=strict)

    if create_graph:
        with torch.enable_grad():
            grad_res = _autograd_grad(grad_inputs, grad_outputs, v, create_graph=create_graph)
            jvp = _fill_in_zeros(grad_res, outputs, strict, create_graph, "back_trick")
    else:
        grad_res = _autograd_grad(grad_inputs, grad_outputs, v, create_graph=create_graph)
        jvp = _fill_in_zeros(grad_res, outputs, strict, create_graph, "back_trick")

# Cleanup objects and return them to the user
outputs = _grad_postprocess(outputs, create_graph)
jvp = _grad_postprocess(jvp, create_graph)

return _tuple_postprocess(outputs, is_outputs_tuple), _tuple_postprocess(jvp, is_outputs_tuple)
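
And a minimal standalone sketch (my example, using a simple elementwise f) of the "back_trick" above: because the backward pass is linear in grad_outputs, differentiating the vjp with respect to a placeholder cotangent recovers the jvp.

import torch

def f(x):
    return x.sin()

x = torch.randn(3, requires_grad=True)
v = torch.randn(3)  # direction for the jvp

out = f(x)
u = torch.zeros_like(out, requires_grad=True)            # placeholder cotangent
vjp = torch.autograd.grad(out, x, u, create_graph=True)  # J^T u, linear in u
jvp = torch.autograd.grad(vjp, u, v)                     # d(J^T u)/du applied along v = J v

print(jvp[0])
print(torch.cos(x) * v)  # matches, since J = diag(cos(x)) for f = sin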

@kshitij12345 (Collaborator, Author) commented:

It seems like test_autograd_functional_jacfwd_inside_transform fails with accidental errors rather than the intended one. (I have added comments to the relevant lines of code.)

def test_autograd_functional_jacfwd_inside_transform(self, device):
    def f(x):
        y = torch.autograd.functional.jacobian(
            lambda x: x.sin().sum(), x, strategy='forward-mode', vectorize=True)
        return y

    B = 5
    x = torch.randn(B, 3)
    with self.assertRaises(RuntimeError):
        vmap(f)(x)

    x = torch.randn([])
    with self.assertRaises(RuntimeError):
        grad(f)(x)

@ngimel added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Apr 12, 2023
@zou3519 (Contributor) left a comment:

LGTM, thanks for clarifying

@kshitij12345 (Collaborator, Author) commented:

@pytorchbot merge

@pytorch-bot (bot) added the ciflow/trunk label (trigger trunk jobs on your pull request) Apr 13, 2023
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.
