
[ONNX] add cast operator after reduce to match desired dtype #100700

Closed
Wants to merge 7 commits

Conversation

TaiPhamD
Contributor

@TaiPhamD TaiPhamD commented May 5, 2023

This PR conditionally inserts a cast operator after a reduction operation so that the exported ONNX model matches the specified dtype. The code changes affect opset 9 and opset 13.

I understand there is an automatic upcast to int64 before the reduction, most likely to prevent overflow, so I left that alone and only conditionally cast back to the desired dtype.
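The decision logic above can be sketched standalone, outside the exporter. This is a minimal illustration, not the PyTorch implementation: the helper name is hypothetical, and the integer dtype codes are the stable ONNX `TensorProto` enum values (FLOAT=1, INT32=6, INT64=7).

```python
# Sketch of the cast-after-reduce decision, independent of torch.onnx internals.
# Stable ONNX TensorProto elem_type codes:
FLOAT, INT32, INT64 = 1, 6, 7

def reduce_output_dtype(input_dtype, requested_dtype=None):
    """Hypothetical helper modeling the exporter's behavior: integer
    reductions are upcast to INT64 to avoid overflow, and a Cast back is
    inserted only when the caller asked for a different dtype.
    Returns (final dtype, whether a Cast node was inserted)."""
    # ReduceSum on integer inputs effectively accumulates in INT64.
    reduced = INT64 if input_dtype in (INT32, INT64) else input_dtype
    if requested_dtype is not None and requested_dtype != reduced:
        # This is where the PR appends a Cast node: Cast(reduced -> requested).
        return requested_dtype, True
    return reduced, False

# torch.sum(a_int32, dtype=torch.int32): a cast back to INT32 is required.
print(reduce_output_dtype(INT32, INT32))  # (6, True)
# torch.sum(a_int64, dtype=torch.int64): already INT64, no cast needed.
print(reduce_output_dtype(INT64, INT64))  # (7, False)
```

The two test scripts below exercise exactly these two branches: the int32 case needs the cast, the int64 case does not.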

Test int32

import torch
import onnx
a = torch.tensor([10, 20, 30, 80], dtype=torch.int32)
def test():
    class SumInt32(torch.nn.Module):
        def forward(self, a):
            return torch.sum(a, dtype=torch.int32)

    sumi = SumInt32().eval()
    assert sumi(a).dtype == torch.int32
    print("Torch model output type matches input type")

    torch.onnx.export(sumi, (a,), "/tmp/sumi_int32.onnx", opset_version=12)
    model = onnx.load("/tmp/sumi_int32.onnx")

    assert model.graph.output[0].type.tensor_type.elem_type == onnx.TensorProto.INT32
    print("ONNX model output type matches input type")
test()

(screenshot: exported sumi_int32.onnx graph)

Test int64

import onnx
import torch

a = torch.tensor([10, 20, 30, 80], dtype=torch.int64)


def test():
    class SumInt64(torch.nn.Module):
        def forward(self, a):
            return torch.sum(a, dtype=torch.int64)

    sumi = SumInt64().eval()
    assert sumi(a).dtype == torch.int64
    print("Torch model output type matches input type")
    torch.onnx.export(sumi, (a,), "/tmp/sumi_int64.onnx", opset_version=12)
    model = onnx.load("/tmp/sumi_int64.onnx")
    assert model.graph.output[0].type.tensor_type.elem_type == onnx.TensorProto.INT64
    print("ONNX model output type matches input type")


test()

(screenshot: exported sumi_int64.onnx graph)

Fixes #100097

@pytorch-bot

pytorch-bot bot commented May 5, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/100700

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 1 Pending

As of commit 274c2a8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla

linux-foundation-easycla bot commented May 5, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@pytorch-bot pytorch-bot bot added the release notes: onnx torch.onnx related changes that should show up in the release notes label May 5, 2023
@TaiPhamD TaiPhamD changed the title add cast operator after sum to match desired dtype [ONNX] add cast operator after sum to match desired dtype May 5, 2023
@TaiPhamD TaiPhamD changed the title [ONNX] add cast operator after sum to match desired dtype [ONNX] add cast operator after reduce to match desired dtype May 5, 2023
@TaiPhamD
Contributor Author

TaiPhamD commented May 5, 2023

I found a few issues with this one, so I will close it until I fix them, then re-open.

@TaiPhamD TaiPhamD closed this May 5, 2023
@TaiPhamD TaiPhamD reopened this May 5, 2023
@BowenBao BowenBao added module: onnx Related to torch.onnx topic: bug fixes topic category labels May 5, 2023
Collaborator

@thiagocrepaldi thiagocrepaldi left a comment


Thank you for the PR. I have pointed out an issue with how you accessed the scalar type of a torch._C.Value. Please use the recommended way.

result = symbolic(g, self)
if dtype_onnx is not None:
    result_dtype_scalar = result.type().scalarType()
    result_dtype_onnx = _type_utils.JitScalarType._from_name(result_dtype_scalar).onnx_type()
Collaborator


Parsing the type from its name is the #1 source of problems (because of the previous comment). Please only use from_value or from_dtype.

return symbolic(g, self)
result = symbolic(g, self)
if dtype_onnx is not None:
    result_dtype_scalar = result.type().scalarType()
Collaborator


This can cause segfault/assert crashes when the model is scripted with torch.jit.script.
To prevent this, use exclusively JitScalarType's public APIs to extract type information from torch._C.Value nodes in a safe way.
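The failure mode behind this review comment can be illustrated without torch. Under torch.jit.script, a value's `type().scalarType()` may return `None`, so looking the type up by name blows up, while a `from_value`-style API can fall back to a default. The mock classes and the `DTYPES` table below are illustrative stand-ins, not the real torch or JitScalarType APIs.

```python
# Illustrative mocks: stand-ins for torch._C.Value, NOT the real classes.
class MockType:
    def __init__(self, scalar_name):
        self._name = scalar_name

    def scalarType(self):
        # Under torch.jit.script, type info may be absent -> returns None.
        return self._name

class MockValue:
    def __init__(self, scalar_name):
        self._type = MockType(scalar_name)

    def type(self):
        return self._type

# Hypothetical name -> ONNX dtype-code table for the sketch.
DTYPES = {"Int": 6, "Long": 7, "Float": 1}

def from_name(name):
    # Name-based lookup: raises KeyError when the name is None/unknown.
    return DTYPES[name]

def from_value(value, default=None):
    # Safe lookup in the spirit of JitScalarType.from_value:
    # tolerate missing type info instead of crashing.
    name = value.type().scalarType()
    return DTYPES.get(name, default)

traced = MockValue("Int")    # traced model: scalar type is known
scripted = MockValue(None)   # scripted model: type info missing

print(from_value(traced))        # 6
print(from_value(scripted, 7))   # falls back to the default: 7
try:
    from_name(scripted.type().scalarType())
except KeyError:
    print("name-based parsing raised KeyError")
```

This is why the review asks for `from_value`/`from_dtype` rather than `_from_name`: the safe APIs own the missing-type fallback instead of crashing at lookup time.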

Contributor Author


Thank you, I will make that change.

@TaiPhamD
Contributor Author

TaiPhamD commented May 5, 2023

Sorry for the lint issue. I used flake8 to test for lint issues locally, but I just realized the CI uses lintrunner, so I'll make that change.

@thiagocrepaldi
Collaborator

> Sorry for the lint issue. I used flake8 to test for lint issues locally, but I just realized the CI uses lintrunner, so I'll make that change.

I always do `make setup_lint`, `lintrunner init`, and `lintrunner -a` before pushing a PR to make sure linting is fine.

@BowenBao
Collaborator

BowenBao commented May 5, 2023

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 5, 2023
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

Labels
- ciflow/trunk (Trigger trunk jobs on your pull request)
- Merged
- module: onnx (Related to torch.onnx)
- open source
- release notes: onnx (torch.onnx related changes that should show up in the release notes)
- topic: bug fixes (topic category)
Development

Successfully merging this pull request may close these issues.

[onnx export] torch.sum unexpected return type
5 participants