Add ONNX registry tutorial #2578
Conversation
Force-pushed from 020e25f to 88549bd
beginner_source/onnx/intro_onnx.py (outdated)

    .. note::
       This tutorial leverages `onnxscript <https://github.com/microsoft/onnxscript#readme>`__
       to create custom ONNX operators. onnxscript is a Python library that allows users to
       create custom ONNX operators in Python. It is a prerequisite learning material for
       this tutorial. Please make sure you have read the onnxscript tutorial before proceeding.
This is only relevant to the ONNX registry tutorial and should probably be moved there.
Users that don't depend on custom operators shouldn't need to spend extra time learning ONNX Script to export models with default operators. This adds unnecessary prerequisites to something we want to keep as simple as possible for users.
This was requested by you in the original PR; I am not sure why it is being reviewed again. I think it's important to point out where symbolic functions come from in the dynamo export, and it's a note about specifying custom operators. I like the idea of putting it on this page.
I apologize for any miscommunication. Code review is an iterative process, and sometimes reviewers have a better view of the final product on a second or third round.
This series of tutorials is designed to present content in increasing levels of complexity, focused on a single task per tutorial. The first of the series only tackles dependency installation and presents a table of contents linking to the other tutorials, each with a specific topic of its own.
However, this note incorrectly states that ONNX Script is a prerequisite learning material for this tutorial, which is only true for the Introduction to ONNX Registry tutorial. Conceptually, ONNX Script is not needed for the Introduction to ONNX or Export a PyTorch model to ONNX tutorials. The new ONNX exporter API might be a lot to tackle for new users, and adding extra reading is an entry barrier for beginners.
The note also drops references to advanced concepts such as "custom operator" before users have a chance to get acquainted with the basic export API, or even to see what a simple model with standard ops looks like.
Please move this piece to tutorial 3.
Done
    # If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
    # by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
    # ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``
Suggested change — replace:

    # If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
    # by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
    # ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``

with:

    # In this section, we will assume that `aten::add.Tensor` is not supported by the ONNX registry, and we will demonstrate how to support it.
    # When a model cannot be exported to ONNX due to a missing operator, the ONNX exporter will show an error message similar to:
    #
    # .. code-block:: python
    #
    #    RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.
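To make the diagnostic above concrete, here is a small self-contained sketch of how an exporter-style check might surface that message. Everything here (`Node`, `SUPPORTED_OPS`, `check_supported`) is a hypothetical stand-in for illustration, not actual torch internals:

```python
# Toy sketch: scan FX-like call_function nodes for targets with no ONNX
# implementation, and raise an error shaped like the exporter's diagnostic.
# Node, SUPPORTED_OPS and check_supported are illustrative names only.

class Node:
    def __init__(self, op, target):
        self.op = op          # e.g. "call_function"
        self.target = target  # e.g. "aten.add.Tensor"

SUPPORTED_OPS = {"aten.mul.Tensor", "aten.relu.default"}

def check_supported(nodes):
    unsupported = [n.target for n in nodes
                   if n.op == "call_function" and n.target not in SUPPORTED_OPS]
    if unsupported:
        raise RuntimeError(
            f"Unsupported FX nodes: {{'call_function': {unsupported}}}.")

try:
    check_supported([Node("call_function", "aten.add.Tensor")])
except RuntimeError as e:
    print(e)  # Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.
```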
This section is focused on adding missing ATen operators, but the example actually replaces an existing one, which is the topic of the next section.
Should we instead add support for a truly missing operator? Maybe a variant of an existing operator that we don't really care about having in the official registry?
I am trying to prevent confusion between the three scenarios. The missing-operator section uses a trick of replacing an existing operator, which beginners might find confusing.
This is a good point. I tried "removing add" and it's also weird. I guess the idea of leaving an operator missing for tutorial purposes contradicts the converter team's goal. So here, instead, I do my best to emphasize that we are adding support to an existing operator, and pretend it wasn't there. Honestly, if there were an intentionally missing operator, users might start contributing/asking about it.
The difference between the three scenarios should not be affected by this, though:
- Missing ATen ops can be addressed with existing ONNX ops.
- Changed ops from the ONNX namespace to com.microsoft.
- Support for entirely custom ops (using ORT as the example backend).
We want to advertise the first one, since it's an advantage that onnxscript brings to us, and it's also more descriptive and easy to understand as an example, but the third one has the most existing users.
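The three scenarios all hinge on a registry that maps qualified operator names to conversion functions, where registering under an existing key overrides the built-in. A toy sketch (this is not the real `torch.onnx.OnnxRegistry` API; class and method names are hypothetical):

```python
# Toy registry sketch: a mapping from "namespace::op.overload" to a
# conversion function. Registering under an existing key overrides the
# built-in (scenarios 1 and 2); a brand-new key adds support for a
# custom op (scenario 3). Names here are illustrative, not torch API.

class ToyOnnxRegistry:
    def __init__(self):
        # Pretend aten::add.Tensor ships with a built-in implementation.
        self._table = {"aten::add.Tensor": lambda x, y: ("onnx.Add", x, y)}

    def is_registered(self, qualified_name):
        return qualified_name in self._table

    def register(self, qualified_name, fn):
        self._table[qualified_name] = fn  # override or add

registry = ToyOnnxRegistry()
assert registry.is_registered("aten::add.Tensor")          # built-in present
registry.register("com.microsoft::Gelu", lambda x: ("com.microsoft.Gelu", x))
assert registry.is_registered("com.microsoft::Gelu")       # custom op added
```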
I do agree 100% with you that we should not miss an aten operator on purpose just for the sake of the tutorial. I would never suggest that :)
In fact, I am OK with keeping the current approach if we cannot find another way that frees users from hacking the registry themselves. In that case, I would create a new section/snippet block just for the un-registration part and make it part of the tutorial. That would 1) teach how to remove an operator from the registry and 2) create a "physical" separation between "adding" the new operator and "removing" the operator in the tutorial.
Nonetheless, I would like to inquire about options that make things simpler for the user.
For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe leveraging the default overload or another overload. Can we try to register an `aten::add.Tensor.default` that we don't implement today, because our onnx-fx interpreter can match `aten::add.Tensor.default` to the `aten::add.Tensor` that already exists?
Maybe something similar to what is being discussed at pytorch/pytorch#109966, in which `aten::split(tensor, dim)` already exists but an `aten::split(tensor, dim, drop_remainder=False)` is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if not already.
> For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe leveraging the default overload or another overload. Can we try to register an `aten::add.Tensor.default` that we don't implement today, because our onnx-fx interpreter can match `aten::add.Tensor.default` to the `aten::add.Tensor` that already exists?

In this approach, we would have to create that ATen overload on the torch side ahead of time to demonstrate it, which overcomplicates the three examples here. Otherwise, we never get the overload from FX.

> Maybe something similar to what is being discussed at pytorch/pytorch#109966, in which `aten::split(tensor, dim)` already exists but an `aten::split(tensor, dim, drop_remainder=False)` is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if not already.

In that case, the user will not be able to overwrite it, because the schema doesn't match. You have to remove the overload first and register it with the new schema. The schema is the primary key for the whole matching, and kwargs are part of it.
Overall, I think this tutorial is full of new things, and we should expose overload usages later. Maybe after getting some feedback on the registry, we can go from there.
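The "schema is the primary key" point can be illustrated with a tiny sketch: when matching is an exact lookup on the full schema string (kwargs included), a call site with an extra kwarg misses the entry registered under the narrower schema. The table and schema strings below are illustrative, not the real matcher:

```python
# Toy sketch of schema-keyed matching: the full schema string, including
# kwargs, acts as the primary key, so there is no fuzzy matching — a call
# with a different schema does not hit the existing entry.

table = {"aten::split(tensor, dim)": "split_impl"}

def lookup(schema):
    return table.get(schema)  # exact schema match or nothing

print(lookup("aten::split(tensor, dim)"))                         # split_impl
# Adding drop_remainder changes the schema, so the entry is missed:
print(lookup("aten::split(tensor, dim, drop_remainder=False)"))   # None
```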
Force-pushed from 88549bd to 0a7d773
Please resolve the merge conflicts
    import torch
    print(torch.__version__)
    torch.manual_seed(191009)  # set the seed for reproducibility

    import onnxscript  # pip install onnxscript
    print(onnxscript.__version__)

    # NOTE: opset18 is the only version of ONNX operators we are
    # using in torch.onnx.dynamo_export for now.
    from onnxscript import opset18

    import onnxruntime  # pip install onnxruntime
    print(onnxruntime.__version__)
No, users will never start the tutorial on ONNX Registry. I've made sure of this by creating the Backends menu with Introduction to ONNX as the landing page, as shown below. Users are forced to go through intro_onnx.py and then click on Introduction to ONNX Registry, if they want.
I am OK with adding a Verifying the installation section in intro_onnx.py, right after Dependencies, with this snippet. We can do it in a separate PR if you prefer, but this section does not belong in onnx_registry_tutorial.py and we should remove it.
    # This tutorial is an introduction to ONNX registry, which
    # empowers users to create their own ONNX registries enabling
    # them to address unsupported operators in ONNX.
I do like the part where the tutorial expresses the ultimate goal. This is pretty clear. In fact, most of the text is great at showing what comes next.
My suggestion is to smooth out the transition from the happy-path scenario ("I exported a model without needing anything extra") presented in tutorial 2 to the advanced scenario requiring knowledge of operator implementation that is needed by tutorial 3.
So I proposed a modified version of the original text from the "Unsupported ATen operators" section to serve as an additional introduction to ATen operators (maintained by PyTorch core) and how ONNX operators (maintained by the ONNX exporter team) interact with the ONNX registry.
This is an attempt to build increasing levels of information before we get to coding. We have at least three concepts here: ATen operators, ONNX operators, and the ONNX registry. Discussing them briefly helps public users that don't work on the ONNX exporter every day build a mental map of how things fit together.
We can iterate on the proposed text, but defining what an operator and a registry are before jumping to the three scenarios where we can implement operators is a good thing.
    # ATen operators are implemented by PyTorch, and the ONNX exporter team must manually implement the
    # conversion from ATen operators to ONNX operators through [ONNX Script](https://onnxscript.ai/). Although the ONNX exporter
    # team has been making their best efforts to support as many ATen operators as possible, some ATen
    # operators are still not supported. In this section, we will demonstrate how you can implement any
    # unsupported ATen operators, which can contribute back to the project through the PyTorch GitHub repo.
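The ATen-to-ONNX conversion functions described above are typically registered by decorating a Python function with the ATen operator it translates. Here is a hedged, stdlib-only sketch of that pattern; the decorator, table, and function names are illustrative stand-ins, not the onnxscript or torch.onnx API:

```python
# Illustrative sketch of decorator-based registration: the decorator records
# a Python function as the translation of a named ATen operator. In real
# onnxscript code the body would call ONNX ops (e.g. op.Add, op.Mul) instead
# of plain arithmetic; ATEN_TO_ONNX and onnx_impl are hypothetical names.

ATEN_TO_ONNX = {}

def onnx_impl(aten_name):
    def wrap(fn):
        ATEN_TO_ONNX[aten_name] = fn  # register the translation
        return fn
    return wrap

@onnx_impl("aten::add.Tensor")
def aten_add(x, y, alpha=1.0):
    # Mirrors ATen add semantics: x + alpha * y
    return x + y * alpha

print(ATEN_TO_ONNX["aten::add.Tensor"](2, 3))  # 5
```

The exporter-side lookup would then resolve each FX node's target against such a table, which is exactly where a missing entry produces the "Unsupported FX nodes" diagnostic.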
replied above too
Force-pushed with a series of review-suggested commits: updates to advanced_source/super_resolution_with_onnxruntime.py and beginner_source/onnx/export_simple_model_to_onnx_tutorial.py (small changes to kick off the build), multiple updates to beginner_source/onnx/onnx_registry_tutorial.py (co-authored-by: Svetlana Karslioglu <[email protected]>), and added comments.
Force-pushed from 0a7d773 to bcd476e
@svekars Thanks, it's ready to go!
We can have more discussion and a follow-up PR on exposing "ONNX variants" when I am back.
Closing this issue as the author is out of office. A new PR will be created that preserves the original ownership.
Reland #2562
Follow-up to #2541, this PR adds a step-by-step guide demonstrating an end-to-end solution for addressing unsupported ATen/ONNX operators. Please review the last commit.
NOTE: To avoid merge conflicts with the previous two PRs (ONNX 1 and ONNX 2), this PR builds on top of them like ghstack. You can review the last commit.
cc @thiagocrepaldi @abock @justinchuby @BowenBao @wschin