Add ONNX registry tutorial #2578
Conversation
Force-pushed from 020e25f to 88549bd
beginner_source/onnx/intro_onnx.py (outdated)

    .. note::
       This tutorial leverages `onnxscript <https://github.com/microsoft/onnxscript#readme>`__
       to create custom ONNX operators. onnxscript is a Python library that allows users to
       create custom ONNX operators in Python. It is a prerequisite learning material for
       this tutorial. Please make sure you have read the onnxscript tutorial before proceeding.
This is only relevant to the ONNX registry tutorial and should probably be moved there.
Users that don't depend on custom operators shouldn't need to spend extra time learning ONNX Script to export models with default operators. This adds unnecessary prerequisites to something we want to keep as simple as possible for users.
This was requested by you in the original PR; I am not sure why it is being reviewed again. I think it's important to point out where symbolic functions come from in the dynamo export, and it's a note about specifying custom operators. I like the idea of putting it on this page.
I apologize for any miscommunication. Code review is an iterative process, and sometimes reviewers have a better view of the final product on a second or third round.
This series of tutorials is designed to present content in increasing levels of complexity, focused on a single task per tutorial. The first of the series only tackles dependency installation and presents a table of contents linking to the other tutorials, each with a specific topic of its own.
However, this note incorrectly states that ONNX Script is a prerequisite learning material for this tutorial, which is only true for the Introduction to ONNX Registry tutorial. Conceptually, ONNX Script is not needed for the Introduction to ONNX or Export a PyTorch model to ONNX tutorials. The new ONNX exporter API might be a lot to tackle for new users, and adding extra reading is an entry barrier for beginners.
The note also drops references to advanced concepts such as "custom operator" before users have a chance to get acquainted with the basic export API, or even to see what a simple model with standard ops looks like.
Please move this piece to tutorial 3.
Done
    # If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
    # by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
    # ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``
Suggested change — replace:

    # If the model cannot be exported to ONNX, for instance, :class:`aten::add.Tensor` is not supported
    # by ONNX The error message can be found, and is as follows (for example, ``aten::add.Tensor``):
    # ``RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}. ``

with:

    # In this section, we will assume that `aten::add.Tensor` is not supported by the ONNX registry, and we will demonstrate how to support it.
    # When a model cannot be exported to ONNX due to a missing operator, the ONNX exporter will show an error message similar to:
    #
    # .. code-block:: python
    #
    #    RuntimeErrorWithDiagnostic: Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.
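To make the diagnostic above concrete, here is a small self-contained sketch of how an exporter-style check might surface that message. Everything here (`Node`, `SUPPORTED_OPS`, `check_supported`) is a hypothetical stand-in for illustration, not actual torch internals:

```python
# Toy sketch: scan FX-like call_function nodes for targets with no ONNX
# implementation, and raise an error shaped like the exporter's diagnostic.
# Node, SUPPORTED_OPS and check_supported are illustrative names only.

class Node:
    def __init__(self, op, target):
        self.op = op          # e.g. "call_function"
        self.target = target  # e.g. "aten.add.Tensor"

SUPPORTED_OPS = {"aten.mul.Tensor", "aten.relu.default"}

def check_supported(nodes):
    unsupported = [n.target for n in nodes
                   if n.op == "call_function" and n.target not in SUPPORTED_OPS]
    if unsupported:
        raise RuntimeError(
            f"Unsupported FX nodes: {{'call_function': {unsupported}}}.")

try:
    check_supported([Node("call_function", "aten.add.Tensor")])
except RuntimeError as e:
    print(e)  # Unsupported FX nodes: {'call_function': ['aten.add.Tensor']}.
```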
This section is focused on adding missing ATen operators, but the example actually replaces an existing one, which is the topic of the next section.
Should we instead add support for a truly missing operator? Maybe a variant of an existing operator that we don't really care about having in the official registry?
I am trying to prevent confusion between the three scenarios. The missing-operator section uses a trick of replacing an existing operator, which beginners might find confusing.
This is a good point. I tried "removing add" and it's also weird. I guess the idea of leaving an operator missing for tutorial purposes contradicts the converter team's goal. So here, instead, I do my best to emphasize that we are adding support to an existing operator, and pretend it wasn't there. Honestly, if there were an intentionally missing operator, users might start contributing/asking about it.
The difference between the three scenarios should not be affected by this, though:
- Missing ATen ops can be addressed with existing ONNX ops.
- Changed ops from the ONNX namespace to com.microsoft.
- Support for entirely custom ops (using ORT as the example backend).
We want to advertise the first one, since it's an advantage that onnxscript brings to us, and it's also more descriptive and easy to understand as an example, but the third one has the most existing users.
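The three scenarios all hinge on a registry that maps qualified operator names to conversion functions, where registering under an existing key overrides the built-in. A toy sketch (this is not the real `torch.onnx.OnnxRegistry` API; class and method names are hypothetical):

```python
# Toy registry sketch: a mapping from "namespace::op.overload" to a
# conversion function. Registering under an existing key overrides the
# built-in (scenarios 1 and 2); a brand-new key adds support for a
# custom op (scenario 3). Names here are illustrative, not torch API.

class ToyOnnxRegistry:
    def __init__(self):
        # Pretend aten::add.Tensor ships with a built-in implementation.
        self._table = {"aten::add.Tensor": lambda x, y: ("onnx.Add", x, y)}

    def is_registered(self, qualified_name):
        return qualified_name in self._table

    def register(self, qualified_name, fn):
        self._table[qualified_name] = fn  # override or add

registry = ToyOnnxRegistry()
assert registry.is_registered("aten::add.Tensor")          # built-in present
registry.register("com.microsoft::Gelu", lambda x: ("com.microsoft.Gelu", x))
assert registry.is_registered("com.microsoft::Gelu")       # custom op added
```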
I do agree 100% with you that we should not miss an aten operator on purpose just for the sake of the tutorial. I would never suggest that :)
In fact, I am OK with keeping the current approach if we cannot find another way that frees users from hacking the registry themselves. In that case, I would create a new section/snippet block just for the un-registration part and make it part of the tutorial. That would 1) teach how to remove an operator from the registry and 2) create a "physical" separation between "adding" the new operator and "removing" the operator in the tutorial.
Nonetheless, I would like to inquire about options that make things simpler for the user.
For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe leveraging the default overload or another overload. Can we try to register an `aten::add.Tensor.default` that we don't implement today, because our onnx-fx interpreter can match `aten::add.Tensor.default` to the `aten::add.Tensor` that already exists?
Maybe something similar to what is being discussed at pytorch/pytorch#109966, in which `aten::split(tensor, dim)` already exists but an `aten::split(tensor, dim, drop_remainder=False)` is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if not already.
> For example, can we leverage the matching logic to register an equivalent operator with different names (overloads)? Maybe leveraging the default overload or another overload. Can we try to register an `aten::add.Tensor.default` that we don't implement today, because our onnx-fx interpreter can match `aten::add.Tensor.default` to the `aten::add.Tensor` that already exists?

In this approach, we would have to create that ATen overload on the torch side ahead of time to demonstrate it, which overcomplicates the three examples here. Otherwise, we never get the overload from FX.

> Maybe something similar to what is being discussed at pytorch/pytorch#109966, in which `aten::split(tensor, dim)` already exists but an `aten::split(tensor, dim, drop_remainder=False)` is added with the same behavior? #109966 could be improved to make this kind of scenario possible, if not already.

In that case, the user will not be able to overwrite it, because the schema doesn't match. You have to remove the overload first and register it with the new schema. The schema is the primary key for the whole matching, and kwargs are part of it.
Overall, I think this tutorial is full of new things, and we should expose overload usages later. Maybe after getting some feedback on the registry, we can go from there.
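The "schema is the primary key" point can be illustrated with a tiny sketch: when matching is an exact lookup on the full schema string (kwargs included), a call site with an extra kwarg misses the entry registered under the narrower schema. The table and schema strings below are illustrative, not the real matcher:

```python
# Toy sketch of schema-keyed matching: the full schema string, including
# kwargs, acts as the primary key, so there is no fuzzy matching — a call
# with a different schema does not hit the existing entry.

table = {"aten::split(tensor, dim)": "split_impl"}

def lookup(schema):
    return table.get(schema)  # exact schema match or nothing

print(lookup("aten::split(tensor, dim)"))                         # split_impl
# Adding drop_remainder changes the schema, so the entry is missed:
print(lookup("aten::split(tensor, dim, drop_remainder=False)"))   # None
```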
Force-pushed from 88549bd to 0a7d773
Please resolve the merge conflicts
    import torch
    print(torch.__version__)
    torch.manual_seed(191009)  # set the seed for reproducibility

    import onnxscript  # pip install onnxscript
    print(onnxscript.__version__)

    # NOTE: opset18 is the only version of ONNX operators we are
    # using in torch.onnx.dynamo_export for now.
    from onnxscript import opset18

    import onnxruntime  # pip install onnxruntime
    print(onnxruntime.__version__)
No, users will never start the tutorial on ONNX Registry. I've made sure of this by creating the Backends menu with Introduction to ONNX as the landing page, as shown below. Users are forced to go through intro_onnx.py and then click on Introduction to ONNX Registry, if they want.
I am OK with adding a Verifying the installation section in intro_onnx.py, right after Dependencies, with this snippet. We can do it in a separate PR if you prefer, but this section does not belong in onnx_registry_tutorial.py and we should remove it.
    # This tutorial is an introduction to ONNX registry, which
    # empowers users to create their own ONNX registries enabling
    # them to address unsupported operators in ONNX.
I do like the part where the tutorial expresses the ultimate goal. This is pretty clear. In fact, most of the text is great at showing what comes next.
My suggestion is to smooth out the transition from the happy-path scenario ("I exported a model without needing anything extra") presented in tutorial 2 to the advanced scenario requiring knowledge of operator implementation that is needed by tutorial 3.
So I proposed a modified version of the original text from the "Unsupported ATen operators" section to serve as an additional introduction to ATen operators (maintained by PyTorch core) and how ONNX operators (maintained by the ONNX exporter team) interact with the ONNX registry.
This is an attempt to build increasing levels of information before we get to coding. We have at least three concepts here: ATen operators, ONNX operators, and the ONNX registry. Discussing them briefly helps public users that don't work on the ONNX exporter every day build a mental map of how things fit together.
We can iterate on the proposed text, but defining what an operator and a registry are before jumping to the three scenarios where we can implement operators is a good thing.
    # ATen operators are implemented by PyTorch, and the ONNX exporter team must manually implement the
    # conversion from ATen operators to ONNX operators through [ONNX Script](https://onnxscript.ai/). Although the ONNX exporter
    # team has been making their best efforts to support as many ATen operators as possible, some ATen
    # operators are still not supported. In this section, we will demonstrate how you can implement any
    # unsupported ATen operators, which can contribute back to the project through the PyTorch GitHub repo.
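The ATen-to-ONNX conversion functions described above are typically registered by decorating a Python function with the ATen operator it translates. Here is a hedged, stdlib-only sketch of that pattern; the decorator, table, and function names are illustrative stand-ins, not the onnxscript or torch.onnx API:

```python
# Illustrative sketch of decorator-based registration: the decorator records
# a Python function as the translation of a named ATen operator. In real
# onnxscript code the body would call ONNX ops (e.g. op.Add, op.Mul) instead
# of plain arithmetic; ATEN_TO_ONNX and onnx_impl are hypothetical names.

ATEN_TO_ONNX = {}

def onnx_impl(aten_name):
    def wrap(fn):
        ATEN_TO_ONNX[aten_name] = fn  # register the translation
        return fn
    return wrap

@onnx_impl("aten::add.Tensor")
def aten_add(x, y, alpha=1.0):
    # Mirrors ATen add semantics: x + alpha * y
    return x + y * alpha

print(ATEN_TO_ONNX["aten::add.Tensor"](2, 3))  # 5
```

The exporter-side lookup would then resolve each FX node's target against such a table, which is exactly where a missing entry produces the "Unsupported FX nodes" diagnostic.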
replied above too
Force-pushed with a series of review-suggested commits: updates to advanced_source/super_resolution_with_onnxruntime.py and beginner_source/onnx/export_simple_model_to_onnx_tutorial.py (small changes to kick off the build), multiple updates to beginner_source/onnx/onnx_registry_tutorial.py (co-authored-by: Svetlana Karslioglu <[email protected]>), and added comments.
Force-pushed from 0a7d773 to bcd476e
@svekars Thanks, it's ready to go!
We can have more discussion and a follow-up PR on exposing "ONNX variants" when I am back.
Closing this issue as the author is out of office. A new PR will be created that preserves the original ownership.
Reland #2562
Follow-up to #2541, this PR adds a step-by-step guide demonstrating an end-to-end solution for addressing unsupported ATen/ONNX operators. Please review the last commit.
NOTE: To avoid merge conflicts with the previous two PRs (ONNX 1 and ONNX 2), this PR builds on top of them like ghstack. You can review the last commit.
cc @thiagocrepaldi @abock @justinchuby @BowenBao @wschin