
Implement ScalarLoop in torch backend #958

Open · wants to merge 18 commits into main

Conversation

Ch0ronomato (Contributor)

Description

Adds ScalarLoop for PyTorch. I implemented it as a loop rather than trying to vectorize it; let me know if I should take the other approach instead.
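
For context, a minimal sketch of the loop-based translation described above (illustrative only, not the exact diff; names like `scalar_loop_sketch` and `n_carry` are made up, and the argument convention of step count, then carries, then constants is assumed from the other backends):

```python
import torch


def scalar_loop_sketch(update, n_carry):
    """Illustrative only: build a torch callable for a non-while ScalarLoop."""

    def scalar_loop(n_steps, *args):
        # Split the runtime arguments into the loop state and the constants.
        carry, constants = list(args[:n_carry]), args[n_carry:]
        for _ in range(int(n_steps)):
            # `update` is assumed to return a sequence of carry tensors.
            carry = update(*carry, *constants)
        return torch.stack(carry)

    # The data-dependent Python loop cannot be traced by torch.compile,
    # so the dispatched callable is excluded from compilation.
    return torch.compiler.disable(scalar_loop)
```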

Related Issue

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

pytensor/link/pytorch/dispatch/scalar.py: 5 review threads (outdated, resolved)
ricardoV94 added the enhancement (New feature or request) and torch (PyTorch backend) labels on Aug 3, 2024
ricardoV94 (Member)

@Ch0ronomato thanks for taking a stab, I left some comments above


codecov bot commented Aug 11, 2024

Codecov Report

Attention: Patch coverage is 88.46154% with 6 lines in your changes missing coverage. Please review.

Project coverage is 81.96%. Comparing base (a377c22) to head (920f5a4).
Report is 21 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pytensor/link/pytorch/dispatch/scalar.py | 84.00% | 2 Missing and 2 partials ⚠️ |
| pytensor/link/pytorch/linker.py | 50.00% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files


@@            Coverage Diff             @@
##             main     #958      +/-   ##
==========================================
+ Coverage   81.90%   81.96%   +0.06%     
==========================================
  Files         182      182              
  Lines       47879    47914      +35     
  Branches     8617     8632      +15     
==========================================
+ Hits        39214    39272      +58     
+ Misses       6492     6474      -18     
+ Partials     2173     2168       -5     
| Files with missing lines | Coverage Δ |
|---|---|
| pytensor/link/pytorch/dispatch/elemwise.py | 74.13% <100.00%> (+5.38%) ⬆️ |
| pytensor/link/pytorch/linker.py | 91.66% <50.00%> (-8.34%) ⬇️ |
| pytensor/link/pytorch/dispatch/scalar.py | 72.91% <84.00%> (+12.04%) ⬆️ |

... and 17 files with indirect coverage changes

pytensor/link/pytorch/dispatch/scalar.py: 2 review threads (outdated, resolved)
ricardoV94 changed the title from "Add torch scalar loop" to "Implement ScalarLoop in torch backend" on Sep 1, 2024
    carry = update(*carry, *constants)
    return torch.stack(carry)

return torch.compiler.disable(scalar_loop)
ricardoV94 (Member)

Can you do recursive=False?
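
For reference, `torch.compiler.disable` takes a `recursive` flag; with `recursive=False` only the wrapped function itself is skipped by the compiler, while the callables it invokes (here, the compiled inner scalar `update`) stay eligible for compilation. The suggestion would look roughly like:

```python
return torch.compiler.disable(scalar_loop, recursive=False)
```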

Ch0ronomato (Contributor Author)

@ricardoV94 - these failures in the CI look a bit strange; I'll look into them before merging... hopefully they go away with merging main 😓

Ch0ronomato (Contributor Author)

@ricardoV94 #1031 is blocking the elemwise test; how do you want to proceed with this PR?

ricardoV94 (Member)

> @ricardoV94 #1031 is blocking the elemwise test; how do you want to proceed with this PR?

If we can't elemwise it, there's not much point to the ScalarLoop. Maybe we need to loop manually instead of using vmap for this Op.

ricardoV94 (Member) left a comment

I suspect it's in the right direction, but I need a bit more help understanding the new code if you can provide it :)

pytensor/link/pytorch/dispatch/elemwise.py: 2 review threads (resolved)
tests/link/pytorch/test_basic.py: 1 review thread (outdated, resolved)
torch_elemwise = pytest.importorskip("pytensor.link.pytorch.dispatch.elemwise")


@pytest.mark.parametrize("input_shapes", [[(5, 1, 1, 8), (3, 1, 1), (8,)]])
Ch0ronomato (Contributor Author)

I set this up so we can try different shapes, but I stuck with this one to get started. If you think we should add more, let me know.

np.testing.assert_equal(mock_inner_func.f.call_count, len(result[0]))

expected_args = torch.FloatTensor([1.0] * (len(input_shapes) - 1) + [0.0]).unbind(0)
expected_calls = starmap(call, repeat(expected_args, mock_inner_func.f.call_count))
Ch0ronomato (Contributor Author)

I'm bullish on itertools, but I think I saw a mention earlier that list comprehensions are preferred. I can refactor it if so.

from torch import is_tensor

if is_tensor(out):
    return out.cpu()
Ch0ronomato (Contributor Author)

FYI, this will probably create a conflict when one of my other PRs gets merged.

final_inputs[i] = list(layer)

# make sure we still have the same number of things
assert len(final_inputs) == len(shaped_inputs)
Ch0ronomato (Contributor Author)

I can put these into the unit test if that's preferred now.

ricardoV94 (Member) commented Nov 12, 2024

If the assert is executed every time at runtime, then yes, let's not do it here.

torch.zeros(*input_shapes[-1])
]
mock_inner_func = MagicMock()
ret_value = torch.rand(2, 2).unbind(0)
Ch0ronomato (Contributor Author)

Maybe rename to expected

mock_inner_func.f.return_value = ret_value
elemwise_fn = torch_elemwise.elemwise_scalar_loop(mock_inner_func.f, None, None)
result = elemwise_fn(*args)
for actual, expected in zip(ret_value, result):
Ch0ronomato (Contributor Author)

These are backwards, FYI.
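
i.e., since `ret_value` is the mocked (expected) output and `result` is what `elemwise_fn` actually produced, the loop would read:

```python
for expected, actual in zip(ret_value, result):
```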

def elemwise_scalar_loop(base_fn, op, node, **kwargs):
"""
ScalarLoop + Elemwise is too common
to not work, but @1031, vmap won't allow it.
ricardoV94 (Member)

Include full link instead of @1031

Elemwise._check_runtime_broadcast(node, inputs)
shaped_inputs = torch.broadcast_tensors(*inputs)
expected_size = shaped_inputs[0].numel()
final_inputs = [s.clone() for s in shaped_inputs]
ricardoV94 (Member)

Why .clone()?

Ch0ronomato (Contributor Author)

This might be unnecessary now. We need the original number of dimensions for the outer loop. I could just grab that count instead.
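
i.e., instead of cloning the broadcast tensors, the loop could record the broadcast rank up front and keep views (a sketch reusing this function's names):

```python
out_ndim = shaped_inputs[0].dim()   # all the outer loop needs
final_inputs = list(shaped_inputs)  # unbind yields views, so no .clone() required
```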

Comment on lines +193 to +196
for _ in range(shaped_inputs[0].dim() - 1):
    for i, _ in enumerate(shaped_inputs):
        layer = chain.from_iterable([s.unbind(0) for s in final_inputs[i]])
        final_inputs[i] = list(layer)
ricardoV94 (Member) commented Nov 12, 2024

What is more performant: doing this nesting, or raveling all the inputs after broadcasting and doing a single unbind loop?

Either way, it doesn't avoid the explicit broadcasting copy, or does it?

Ch0ronomato (Contributor Author)

Ahhhhh, this is basically like ravel, you're right!

According to the torch docs, ravel only copies if needed, so there may be cases where no copying happens.
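
A quick way to see this (a sketch, assuming plain CPU tensors): `ravel` on an already-contiguous tensor returns a view sharing storage, while a non-contiguous input forces a copy.

```python
import torch

x = torch.arange(6).reshape(2, 3)            # contiguous
assert x.ravel().data_ptr() == x.data_ptr()  # view, no copy

y = x.t()                                    # non-contiguous transpose
assert y.ravel().data_ptr() != y.data_ptr()  # ravel has to copy here
```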

assert all(len(x.shape) == 0 for tensor in final_inputs for x in tensor)
res = [base_fn(*args) for args in zip(*final_inputs)]

return [torch.stack(tuple(out[i] for out in res)) for i in range(len(res[0]))]
ricardoV94 (Member)

Will this reintroduce the original shape? Say, if the Elemwise of the ScalarLoop had output shape == (5, 3, 2)?
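
For what it's worth, as written the stacked result stays flat (one dimension of length `expected_size`); restoring the Elemwise output shape would need an explicit reshape, something like (a sketch reusing this function's names):

```python
out_shape = shaped_inputs[0].shape
return [
    torch.stack(tuple(out[i] for out in res)).reshape(out_shape)
    for i in range(len(res[0]))
]
```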

Comment on lines +62 to +65
if len(node.outputs) == 2:
    return carry[0], done
else:
    return carry, done
ricardoV94 (Member)

Does this work?

Suggested change:
-if len(node.outputs) == 2:
-    return carry[0], done
-else:
-    return carry, done
+return *carry, done
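
For reference, `return *carry, done` is valid starred-expression syntax (Python 3.8+) and always produces a flat tuple, so it covers both branches; the difference is that the multi-carry case is flattened rather than returned as a nested tuple:

```python
def f(carry, done):
    return *carry, done

assert f((1,), True) == (1, True)       # single carry: same as (carry[0], done)
assert f((1, 2), True) == (1, 2, True)  # multiple carries: flat, not ((1, 2), True)
```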

@@ -343,3 +380,44 @@ def test_pytorch_OpFromGraph():

f = FunctionGraph([x, y, z], [out])
compare_pytorch_and_py(f, [xv, yv, zv])


def test_ScalarLoop_Elemwise():
ricardoV94 (Member) commented Nov 12, 2024

Since there's a special condition for one vs. multiple carries, please also test both kinds of loop, with single and multiple updates.
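
Something along these lines could exercise the multiple-carry path (a sketch only: it assumes the `ScalarLoop` constructor from `pytensor.scalar.loop`, the `pt.tensor(..., shape=...)` helper, and the `FunctionGraph`/`compare_pytorch_and_py` utilities already used in this test file; exact signatures may differ):

```python
import numpy as np
import pytensor.tensor as pt
from pytensor.graph import FunctionGraph
from pytensor.scalar import float32 as scalar_float32
from pytensor.scalar.loop import ScalarLoop
from pytensor.tensor.elemwise import Elemwise


def test_ScalarLoop_Elemwise_multiple_carries():
    # Inner scalar graph with two carries.
    init_x = scalar_float32("init_x")
    init_y = scalar_float32("init_y")
    op = ScalarLoop(init=[init_x, init_y], update=[init_x * 2, init_y + init_x])

    # Outer Elemwise over broadcastable inputs.
    n_steps = pt.scalar("n_steps", dtype="int32")
    x0 = pt.vector("x0", dtype="float32")
    y0 = pt.tensor("y0", shape=(7, 3, 1), dtype="float32")
    x_out, y_out = Elemwise(op)(n_steps, x0, y0)

    fg = FunctionGraph([n_steps, x0, y0], [x_out, y_out])
    compare_pytorch_and_py(
        fg,
        [
            np.array(5, dtype="int32"),
            np.arange(3, dtype="float32"),
            np.random.normal(size=(7, 3, 1)).astype("float32"),
        ],
    )
```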

torch_elemwise = pytest.importorskip("pytensor.link.pytorch.dispatch.elemwise")


@pytest.mark.parametrize("input_shapes", [[(5, 1, 1, 8), (3, 1, 1), (8,)]])
ricardoV94 (Member) commented Nov 12, 2024

I don't like this test. Just use these shapes in the test above and let the numerical checks do their job.

Ch0ronomato (Contributor Author)

Okay, sounds good. I made this to try to lock down the implementation a bit. I also added it for understanding; does the method make sense now?

Comment on lines +394 to +395
n_steps = pt.scalar("n_steps", dtype="int32")
x0 = pt.vector("x0", dtype="float32")
ricardoV94 (Member) commented Nov 12, 2024

Add a second carry, say of type tensor(shape=(7, 3, 1)), so it broadcasts with the vector x0.

This will make sure multiple carries are working and that we get outputs of the right shape.

ricardoV94 (Member)

Or just use the shapes you had in the test below; that's fine.

ricardoV94 (Member) commented Nov 12, 2024

How is unbind(0) different from [x[i] for i in range(x.size()[0])]?

Ch0ronomato (Contributor Author)

> How is unbind(0) different from [x[i] for i in range(x.size()[0])]?

https://discuss.pytorch.org/t/the-purpose-of-unbind/98648

It's essentially the same, maybe faster.
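
i.e. both produce the same row views; a quick check:

```python
import torch

x = torch.arange(6).reshape(3, 2)
a = x.unbind(0)                       # tuple of 3 views along dim 0
b = [x[i] for i in range(x.size(0))]  # same slices, one index at a time
assert all(torch.equal(u, v) for u, v in zip(a, b))
```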

ricardoV94 (Member) commented Nov 12, 2024

> How is unbind(0) different from [x[i] for i in range(x.size()[0])]?
>
> https://discuss.pytorch.org/t/the-purpose-of-unbind/98648
>
> It's essentially the same, maybe faster.

But if we index in the loop after raveling, we don't need all the slices in memory. This is looking like a custom Elemwise with explicit broadcasting:

bcasted_inputs = torch.broadcast_tensors(*inputs)
raveled_inputs = [inp.ravel() for inp in bcasted_inputs]

out_shape = bcasted_inputs[0].size()
out_size = out_shape.numel()
n_outputs = len(node.outputs)
# out.dtype is a PyTensor dtype string here; it would still need mapping to a torch dtype
raveled_outputs = [torch.empty(out_size, dtype=out.dtype) for out in node.outputs]

for i in range(out_size):
    core_outs = core_func(*(inp[i] for inp in raveled_inputs))
    if n_outputs == 1:
        raveled_outputs[0][i] = core_outs
    else:
        for o in range(n_outputs):
            raveled_outputs[o][i] = core_outs[o]

outputs = tuple(out.view(out_shape) for out in raveled_outputs)
if n_outputs == 1:
    return outputs[0]
else:
    return outputs

Also note that nothing here is specific to ScalarLoop, so it can serve as a (non-performant) fallback for all sorts of Elemwise Ops.

Ch0ronomato (Contributor Author) commented Nov 12, 2024

That looks great. I think we'll still need some dispatch logic to know what can't be vmap'd; do we want to keep the current method? How does your approach merge with #1032?

ricardoV94 (Member)

> That looks great. I think we'll still need some dispatch logic to know what can't be vmap'd; do we want to keep the current method?

Yes, this can be a fallback only for registered Ops (specifically only ScalarLoop for the time being).
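
A minimal sketch of that gating, assuming the backend's `pytorch_funcify` singledispatch lives in `pytensor.link.pytorch.dispatch.basic` (by analogy with the other backends) and using this PR's `elemwise_scalar_loop` helper; the real Elemwise dispatch does more than shown here:

```python
from pytensor.link.pytorch.dispatch.basic import pytorch_funcify
from pytensor.link.pytorch.dispatch.elemwise import elemwise_scalar_loop
from pytensor.scalar.loop import ScalarLoop
from pytensor.tensor.elemwise import Elemwise


@pytorch_funcify.register(Elemwise)
def pytorch_funcify_Elemwise(op, node, **kwargs):
    scalar_op = op.scalar_op
    base_fn = pytorch_funcify(scalar_op, node=node, **kwargs)

    if isinstance(scalar_op, ScalarLoop):
        # vmap cannot trace the data-dependent inner loop, so fall back to
        # the explicit ravel-and-loop implementation for this Op only.
        return elemwise_scalar_loop(base_fn, op, node, **kwargs)

    # Every other scalar op keeps the usual broadcasting path.
    def elemwise_fn(*inputs):
        Elemwise._check_runtime_broadcast(node, inputs)
        return base_fn(*inputs)

    return elemwise_fn
```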

ricardoV94 (Member)

If my suggestion works, it should be better than the nested unbind, unless torch is really weird.
