ttnn.max and min operations throw runtime error when called with shapes not aligned to tiles #8638

Open
nemanjagrujic opened this issue May 20, 2024 · 0 comments
Labels
bug Something isn't working GS op_cat: reduces WH

When testing ttnn ops with TILE layout and random shapes such as [4, 7, 21, 133], most operations work correctly.

However, the ttnn.max and ttnn.min operations fail when the last dimension is not aligned to TILE boundaries. We get a runtime error like:

RuntimeError: shape '[198, 32]' is invalid for input of size 7168

The problem is observed on both GS and WH cards.

To Reproduce
Steps to reproduce the behavior:

  1. Check out branch ngrujic/op_bug_unit_tests (soon to be merged into main).
  2. Run unit test tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_min_max.py using this command:
pytest tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_min_max.py

Expected behavior
Several test cases in the unit test currently fail with:

E       RuntimeError: TT_THROW @ ../ttnn/cpp/ttnn/operations/core.hpp:78: tt::exception
E       info:
E       Unable to reshape a tensor in TILE_LAYOUT to non-tile height and width! Please convert the tensor to ROW_MAJOR_LAYOUT first.
E       backtrace:
E        --- /home/ubuntu/tt-metal/ttnn/ttnn/_ttnn.so(+0x133996) [0x7f93d5ca2996]
E        --- /home/ubuntu/tt-metal/ttnn/ttnn/_ttnn.so(+0x17dd22) [0x7f93d5cecd22]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyCFunction_Call+0x59) [0x5d5499]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x6eb0) [0x54d9f0]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x57a42e]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyObject_MakeTpCall+0x296) [0x5d6066]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x6329) [0x54ce69]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x57a4af]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyObject_MakeTpCall+0x296) [0x5d6066]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x690a) [0x54d44a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1876) [0x5483b6]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x1b6) [0x5d5846]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x725) [0x547265]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x5d69) [0x54c8a9]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x4e1a5e]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x5d69) [0x54c8a9]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x57a4af]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyObject_MakeTpCall+0x296) [0x5d6066]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x690a) [0x54d44a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x1b6) [0x5d5846]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x907) [0x547447]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x1b6) [0x5d5846]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x5d69) [0x54c8a9]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x4e1a5e]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x5d69) [0x54c8a9]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x57a4af]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x25e) [0x5d4e0e]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x725) [0x547265]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3() [0x4e1bd0]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1876) [0x5483b6]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(PyObject_Call+0x62) [0x5d4c12]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1f26) [0x548a66]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x725) [0x547265]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalCodeWithName+0x26a) [0x54552a]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x393) [0x5d5a23]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyEval_EvalFrameDefault+0x1876) [0x5483b6]
E        --- /home/ubuntu/tt-metal/python_env/bin/python3(_PyFunction_Vectorcall+0x1b6) [0x5d5846]

ttnn/ttnn/decorators.py:766: RuntimeError

During handling of the above exception, another exception occurred:

input_shape = [(198, 216)], dtype = [<DataType.BFLOAT16: 0>], dlayout = [<Layout.TILE: 1>]
in_mem_config = [tt::tt_metal::MemoryConfig(memory_layout=TensorMemoryLayout::INTERLEAVED,buffer_type=BufferType::DRAM,shard_spec=std::nullopt)]
output_mem_config = tt::tt_metal::MemoryConfig(memory_layout=TensorMemoryLayout::INTERLEAVED,buffer_type=BufferType::DRAM,shard_spec=std::nullopt), data_seed = 4171614, dim = -1
device = <tt_lib.device.Device object at 0x7f93d9445770>

    @pytest.mark.parametrize(
        "input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, dim",
        (test_sweep_args),
    )
    def test_min(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, dim, device):
>       run_min_tests(input_shape, dtype, dlayout, in_mem_config, output_mem_config, data_seed, dim, device)

tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_min_max.py:106: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_min_max.py:36: in run_min_tests
    raise e
tests/ttnn/python_api_testing/non_working_unit_tests/wormhole/test_min_max.py:24: in run_min_tests
    tt_result = ttnn_ops.min(
tests/ttnn/python_api_testing/sweep_tests/ttnn_ops.py:2658: in min
    t1 = ttnn.min(t0, dim=dim, memory_config=memory_config_to_ttnn(output_mem_config))
ttnn/ttnn/decorators.py:701: in __call__
    output = self.decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:626: in call_wrapper
    output = decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:411: in call_wrapper
    return function(*function_args, **function_kwargs)
ttnn/ttnn/operations/reduction.py:317: in min
    output_tensor = ttnn.reshape(output_tensor, ttnn.Shape(output_shape, padded_output_shape))
ttnn/ttnn/decorators.py:701: in __call__
    output = self.decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:580: in call_wrapper
    return decorated_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:411: in call_wrapper
    return function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:560: in call_wrapper
    output = golden_function(*function_args, **function_kwargs)
ttnn/ttnn/decorators.py:539: in golden_function
    output = self.golden_function(*updated_function_args, **updated_function_kwargs)
ttnn/ttnn/operations/core.py:129: in _golden_function
    return input_tensor.reshape(shape).contiguous().clone()
ttnn/ttnn/operations/core.py:312: in __torch_function__
    return super().__torch_function__(func, types, args, kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

cls = <class 'ttnn.operations.core.TorchTensor'>, func = <built-in method  of PyCapsule object at 0x7f9417140fc0>, types = (<class 'torch.Tensor'>,)
args = (TorchTensor([[[[-98.5000,   0.0000,   0.0000,  ...,   0.0000,   0.0000,
                  0.0000],
               [-9...  0.0000,   0.0000,   0.0000,  ...,   0.0000,   0.0000,
                  0.0000]]]], dtype=torch.bfloat16), (198, 32))
kwargs = {}

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        """
        This __torch_function__ implementation wraps subclasses such that
        methods called on subclasses return a subclass instance instead of
        a ``torch.Tensor`` instance.
    
        One corollary to this is that you need coverage for torch.Tensor
        methods if implementing __torch_function__ for subclasses.
    
        We recommend always calling ``super().__torch_function__`` as the base
        case when doing the above.
    
        While not mandatory, we recommend making `__torch_function__` a classmethod.
        """
        if kwargs is None:
            kwargs = {}
    
        if not all(issubclass(cls, t) for t in types):
            return NotImplemented
    
        with _C.DisableTorchFunctionSubclass():
>           ret = func(*args, **kwargs)
E           RuntimeError: shape '[198, 32]' is invalid for input of size 7168
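
The numbers in the final error can be checked by hand. The sketch below is one reading of the traceback, not a confirmed root cause: it assumes a 32x32 tile (the tile size is not stated in this report) and assumes the tensor reaching the golden reshape still carries its tile padding. Under those assumptions the padded output of a reduction over dim=-1 on a (198, 216) input holds 7168 elements, while the requested shape (198, 32) needs only 6336.

```python
import math

TILE = 32  # assumed tile edge (32x32 tiles); not stated in the report


def round_up_to_tile(n: int, tile: int = TILE) -> int:
    """Round n up to the next multiple of the tile edge."""
    return math.ceil(n / tile) * tile


input_shape = (198, 216)  # failing case from the traceback

# Assumption: reducing over dim=-1 leaves a single column, which TILE layout
# pads out to one full tile width, and the rows are padded to a tile multiple.
padded_output_shape = (round_up_to_tile(input_shape[0]), TILE)

# Shape passed to reshape in the golden function, as shown in the traceback.
requested_shape = (input_shape[0], TILE)

padded_numel = math.prod(padded_output_shape)
requested_numel = math.prod(requested_shape)

print(padded_output_shape, padded_numel)
print(requested_shape, requested_numel)
```

The padded element count matches the "input of size 7168" in the error message, which suggests the row padding (198 rounded up to 224) is what makes the reshape to (198, 32) invalid.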

Running sweeps
To get additional information and results for the different combinations of input shapes, types, layouts, and memory configs this operation was tested with, you can also run the sweeps locally and check the results. To do this:

  1. Run the non-working sweep using pytest tests/ttnn/python_api_testing/sweep_tests/run_sweep_test.py --input-path tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/wormhole/ttnn_min_test.yaml --input-method cli --cli-input results_ttnn_min_wh
  2. After the run completes, all sweep results should be available in the specified output directory.

You can try more sweeps by pointing the above command at these files:

tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/wormhole/ttnn_min_test.yaml 
tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/wormhole/ttnn_max_test.yaml 
tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/grayskull/ttnn_min_test.yaml 
tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken/grayskull/ttnn_max_test.yaml 
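
As a convenience, the four sweep invocations can be generated rather than typed out. This is a minimal sketch that only builds and prints the command lines; it reuses the runner path and flags from the command above. The output directory names other than results_ttnn_min_wh are hypothetical and follow the same naming pattern by assumption.

```python
import shlex

RUNNER = "tests/ttnn/python_api_testing/sweep_tests/run_sweep_test.py"
CONFIG_DIR = (
    "tests/ttnn/python_api_testing/sweep_tests/test_configs/ci_sweep_tests_broken"
)


def sweep_command(arch: str, op: str) -> list[str]:
    """Build the pytest command for one broken-sweep config file."""
    suffix = {"wormhole": "wh", "grayskull": "gs"}[arch]
    config = f"{CONFIG_DIR}/{arch}/ttnn_{op}_test.yaml"
    # Hypothetical output directory name, modeled on results_ttnn_min_wh.
    out_dir = f"results_ttnn_{op}_{suffix}"
    return [
        "pytest", RUNNER,
        "--input-path", config,
        "--input-method", "cli",
        "--cli-input", out_dir,
    ]


commands = [
    sweep_command(arch, op)
    for arch in ("wormhole", "grayskull")
    for op in ("min", "max")
]

for command in commands:
    print(shlex.join(command))
```

Each printed line can be pasted into a shell from the repository root, on a machine with the matching card.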