
Add ONNX registry tutorial #2578

Conversation

@titaiwangms (Contributor) commented Oct 3, 2023

Reland #2562

Follow up to #2541: this PR adds a step-by-step guide demonstrating an end-to-end solution for addressing the unsupported ATen/ONNX operators issue. Please review the last commit.

NOTE: To avoid merge conflicts with the previous two PRs (ONNX 1 and ONNX 2), this PR builds on top of them like ghstack. You can review the last commit.

cc @thiagocrepaldi @abock @justinchuby @BowenBao @wschin

@pytorch-bot bot commented Oct 3, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2578

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Comment on lines 45 to 50
.. note::
This tutorial leverages `onnxscript <https://github.com/microsoft/onnxscript#readme>`__
to create custom ONNX operators. onnxscript is a Python library that allows users to
create custom ONNX operators in Python. It is a prerequisite learning material for
this tutorial. Please make sure you have read the onnxscript tutorial before proceeding.

Contributor

This is only relevant to the ONNX registry tutorial and probably should be moved there.

Users that don't depend on custom operators don't need to spend extra time learning ONNX Script to export models with default operators. This adds unnecessary prerequisites to something we want to be as simple as possible for users.

Contributor Author

@titaiwangms commented Oct 4, 2023

This was requested by you in the original PR, so I am not sure why it is being reviewed again. I think it's important to point out where symbolic functions come from in dynamo export, and it's a note tied to specifying custom operators. I like the idea of putting it on this page.

Contributor

I apologize for any miscommunication. Code-review is an iterative process, and sometimes reviewers have a better view of the final product on a second or third round.

This series of tutorials is designed to present content in increasing levels of complexity, focused on a single task per tutorial. The first of the series only tackles dependency installation and presents a table of contents linking to the other tutorials, each with a specific topic of its own.

However, this note incorrectly states that ONNX Script is prerequisite learning material for this tutorial, which is only true for the Introduction to ONNX Registry tutorial. Conceptually, ONNX Script is not needed for the Introduction to ONNX or Export a PyTorch model to ONNX tutorials. The new ONNX exporter API might be a lot to tackle for new users, and adding extra reading creates an entry barrier for beginners.

The note also dumps references to advanced concepts such as "custom operator" before users have a chance to get acquainted with the basic export API, or even to see what a simple model with standard ops looks like.

Please move this piece to tutorial 3.

Contributor Author

Done

beginner_source/onnx/intro_onnx.py (resolved)
beginner_source/onnx/onnx_registry_tutorial.py (outdated, resolved)
beginner_source/onnx/intro_onnx.py (outdated, resolved)
Comment on lines +56 to +68
# If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
# by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
# ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change:

-    # If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
-    # by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
-    # ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``
+    # In this section, we will assume that `aten::add.Tensor` is not supported by the ONNX registry, and we will demonstrate how to support it.
+    # When a model cannot be exported to ONNX due to a missing operator, the ONNX exporter will show an error message similar to:
+    #
+    # .. code-block:: python
+    #
+    #    RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is focused on adding missing ATen operators, but the example actually replaces an existing one, which is the topic of the next section.

I think we should add support for a truly missing operator. Maybe a variant of an existing operator that we don't really care about having in the official registry?

I am trying to prevent confusion between the three scenarios. The missing-operator section uses the trick of replacing an existing operator, which beginners might find confusing.

Contributor Author

This is a good point. I tried "removing add" and it's also weird. I guess the idea of leaving an operator missing for tutorial purposes contradicts the converter team's goal. So instead, I do my best here to emphasize that we are adding support for an existing operator, and pretend it wasn't there. Honestly, if there were a purposely missing operator, users might start contributing to it or asking about it.

The difference between the three scenarios should not be affected by this, though:

  1. A missing ATen operator can be addressed with existing ONNX ops.
  2. An op changed from the ONNX namespace to com.microsoft.
  3. Support for entirely custom ops (using ORT as the example backend).

We want to advertise the first one, since it's an advantage that onnxscript brings us, and it is also more descriptive and easier to understand as an example, but the third one has the most existing users.
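The three scenarios above can be illustrated with a toy registry. This is a hedged sketch with hypothetical names (`ToyOnnxRegistry` and its methods are stand-ins, not the actual torch.onnx API): registration either fills a gap for a missing op, overrides an existing symbolic function, or adds an op unknown to both ATen and ONNX.

```python
# Toy model of an ONNX-style operator registry (hypothetical API,
# not the actual torch.onnx.OnnxRegistry implementation).

class ToyOnnxRegistry:
    def __init__(self):
        # key: "namespace::op_name.overload" -> symbolic function
        self._table = {}

    def register(self, qualified_name, fn):
        # Registration both adds missing ops and overrides existing ones;
        # the last registration for a key wins.
        self._table[qualified_name] = fn

    def is_registered(self, qualified_name):
        return qualified_name in self._table

    def dispatch(self, qualified_name, *args):
        # An unregistered op surfaces as an "unsupported node" error,
        # mimicking the exporter's diagnostic message.
        if qualified_name not in self._table:
            raise RuntimeError(
                f"Unsupported FX nodes: {{'call_function': ['{qualified_name}']}}")
        return self._table[qualified_name](*args)

registry = ToyOnnxRegistry()

# Scenario 1: a "missing" ATen op implemented with existing primitives.
registry.register("aten::add.Tensor", lambda x, y: x + y)

# Scenario 2: override with a contributed-namespace (com.microsoft) kernel.
registry.register("aten::gelu", lambda x: ("com.microsoft::Gelu", x))

# Scenario 3: a custom op unknown to both ATen and ONNX, backed by a runtime.
registry.register("custom::my_op", lambda x: ("custom_backend", x))

print(registry.dispatch("aten::add.Tensor", 2, 3))  # 5
```

The point of the sketch is only the dispatch-or-fail behavior; the real exporter's matching is considerably richer (overloads, schemas, opset versions).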

Contributor

I agree 100% with you that we should not omit an ATen operator on purpose just for the sake of the tutorial. I would never suggest that :)

In fact, I am OK with keeping the current approach if we cannot find another way that frees users from hacking the registry themselves. In that case, I would create a new section/snippet block just for the un-registration part and make it part of the tutorial. That would 1) teach how to remove an operator from the registry and 2) create a "physical" separation in the tutorial between adding the new operator and removing the operator.

Nonetheless, I would like to inquire about options that make things simpler for the user.

For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe by leveraging the default overload or another overload. Can we register an aten::add.Tensor.default that we don't implement today, given that our onnx-fx interpreter can match aten::add.Tensor.default to the aten::add.Tensor that already exists?

Maybe something similar to what is being discussed in pytorch/pytorch#109966, in which aten::split(tensor, dim) already exists but an aten::split(tensor, dim, drop_remainder=False) is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if it isn't already.

Contributor Author

> For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe by leveraging the default overload or another overload. Can we register an aten::add.Tensor.default that we don't implement today, given that our onnx-fx interpreter can match aten::add.Tensor.default to the aten::add.Tensor that already exists?

With this approach, we would have to create that ATen overload ahead of time on the torch side to demonstrate it, which overcomplicates the three examples here. Otherwise, we never get the overload from FX.

> Maybe something similar to what is being discussed in pytorch/pytorch#109966, in which aten::split(tensor, dim) already exists but an aten::split(tensor, dim, drop_remainder=False) is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if it isn't already.

In that case, users will not be able to overwrite it, because the schema doesn't match. You have to remove the overload first, and then register it with the new schema. The schema is the primary key for the whole matching process, and kwargs are part of it.

Overall, I think this tutorial is already full of new things, and we should expose overload usage later. Maybe after getting some feedback on the registry, we can go from there.
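The schema-as-primary-key behavior described above can be sketched in a few lines (a hypothetical toy, not the exporter's real matching logic): a registration under a different schema coexists with the old one instead of overwriting it, so the old overload must be removed explicitly first.

```python
# Toy sketch: the full schema (op name plus its parameters/kwargs) acts
# as the primary key, so a function with a different schema does not
# overwrite an existing registration.

schema_registry = {}

def register(namespace, op, params, fn):
    # Parameters are normalized into the key so kwargs are part of it.
    key = (namespace, op, tuple(sorted(params)))
    schema_registry[key] = fn

def match(namespace, op, params):
    key = (namespace, op, tuple(sorted(params)))
    return schema_registry.get(key)

register("aten", "split", ("tensor", "dim"),
         lambda t, d: f"split({t},{d})")

# A variant with an extra kwarg has a different schema: it coexists
# rather than overwriting, and old-schema lookups still hit the old entry.
register("aten", "split", ("tensor", "dim", "drop_remainder"),
         lambda t, d, r=False: f"split({t},{d},{r})")

assert match("aten", "split", ("tensor", "dim")) is not None
assert match("aten", "split", ("tensor", "dim", "drop_remainder")) is not None

# To replace the original behavior, the old key must be removed first.
del schema_registry[("aten", "split", ("dim", "tensor"))]
assert match("aten", "split", ("tensor", "dim")) is None
```

This is only meant to show why "register with a new schema" does not behave like "overwrite the old overload" in a schema-keyed table.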

beginner_source/onnx/onnx_registry_tutorial.py (outdated, resolved)
beginner_source/onnx/onnx_registry_tutorial.py (outdated, resolved)
beginner_source/onnx/onnx_registry_tutorial.py (outdated, resolved)
@titaiwangms force-pushed the titaiwang/add-onnx-registry-tutorial branch from 88549bd to 0a7d773 on October 4, 2023 09:25
@svekars (Contributor) commented Oct 4, 2023

Please resolve the merge conflicts


beginner_source/onnx/intro_onnx.py (outdated, resolved)
beginner_source/onnx/intro_onnx.py (outdated, resolved)
Comment on lines +31 to +55
import torch
print(torch.__version__)
torch.manual_seed(191009) # set the seed for reproducibility

import onnxscript # pip install onnxscript
print(onnxscript.__version__)

# NOTE: opset18 is the only version of ONNX operators we are
# using in torch.onnx.dynamo_export for now.
from onnxscript import opset18

import onnxruntime # pip install onnxruntime
print(onnxruntime.__version__)
Contributor

No, users will never start the tutorial series at ONNX Registry. I've made sure of this by creating the Backends menu with Introduction to ONNX as the landing page, as shown below:

[screenshot: the Backends menu with Introduction to ONNX as the landing page]

Users are forced to go through intro_onnx.py and then click on Introduction to ONNX Registry, if they want.

I am OK with adding a "Verifying the installation" section to intro_onnx.py right after "Dependencies" with this snippet. We can do it in a separate PR if you prefer, but this section does not belong in onnx_registry_tutorial.py and we should remove it.
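A "Verifying the installation" snippet of that kind might look like the following hedged sketch (the function name and wording here are illustrative, not taken from the tutorial); using importlib.util.find_spec lets it report missing dependencies without raising ImportError:

```python
import importlib.util

# Report which of the tutorial-series dependencies are importable.
# The package list mirrors the Dependencies section of intro_onnx.py.
def verify_installation(packages=("torch", "onnxscript", "onnxruntime")):
    missing = [p for p in packages if importlib.util.find_spec(p) is None]
    if missing:
        return "Missing dependencies: " + ", ".join(missing)
    return "All dependencies are installed."

print(verify_installation())
```

This keeps the check in one self-contained block, so the intro tutorial would not need any of the version-printing boilerplate currently in onnx_registry_tutorial.py.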

Comment on lines +19 to +21
# This tutorial is an introduction to ONNX registry, which
# empowers users to create their own ONNX registries enabling
# them to address unsupported operators in ONNX.
Contributor

I do like the part where the tutorial expresses the ultimate goal. This is pretty clear. In fact, most of the text is great at showing what comes next.

My suggestion is to smooth out the transition from the happy-path scenario ("I exported a model without needing anything extra") presented in tutorial 2 to the advanced scenario requiring knowledge of operator implementation that is needed in tutorial 3.

So I proposed a modified version of the original text from the "Unsupported ATen operators" section to serve as an additional introduction to ATen operators (maintained by PyTorch core) and to how ONNX operators (maintained by the ONNX exporter) interact with the ONNX registry.

This is an attempt to build increasing levels of information before we get to coding. We have at least three concepts here: ATen operators, ONNX operators, and the ONNX registry. Discussing them briefly helps public users who don't work on the ONNX exporter every day to build a mental map of how things fit together.

We can iterate on the proposed text, but defining what an operator and a registry are before jumping to the three scenarios in which we can implement operators is a good thing.

beginner_source/onnx/onnx_registry_tutorial.py (outdated, resolved)
Comment on lines 50 to 54
# ATen operators are implemented by PyTorch, and the ONNX exporter team must manually implement the
# conversion from ATen operators to ONNX operators through [ONNX Script](https://onnxscript.ai/). Although the ONNX exporter
# team has been making their best efforts to support as many ATen operators as possible, some ATen
# operators are still not supported. In this section, we will demonstrate how you can implement any
# unsupported ATen operators, which can contribute back to the project through the PyTorch GitHub repo.
Contributor

replied above too


Thiago Crepaldi and others added commits, October 5, 2023 00:16:

- Update advanced_source/super_resolution_with_onnxruntime.py (small change to kick off the build)
- Update advanced_source/super_resolution_with_onnxruntime.py (small fix, to kick off the build)
- Update beginner_source/onnx/export_simple_model_to_onnx_tutorial.py (small change)
- Multiple updates to beginner_source/onnx/onnx_registry_tutorial.py (co-authored-by: Svetlana Karslioglu <[email protected]>)
- Update beginner_source/onnx/export_simple_model_to_onnx_tutorial.py (co-authored-by: Svetlana Karslioglu <[email protected]>)
- added comments
@titaiwangms force-pushed the titaiwang/add-onnx-registry-tutorial branch from 0a7d773 to bcd476e on October 5, 2023 00:58
@titaiwangms (Contributor Author)

> Please resolve the merge conflicts

@svekars Thanks, it's ready to go!

@titaiwangms (Contributor Author)

@thiagocrepaldi

We can have more discussion and a follow-up PR on exposing "ONNX variants" when I am back.

@titaiwangms changed the title from "[Reland][ONNX 3] Add ONNX registry tutorial" to "Add ONNX registry tutorial" on Oct 5, 2023
@svekars (Contributor) commented Oct 5, 2023

Closing this PR as the author is out of office. A new PR will be created that preserves the original ownership.

5 participants