
Lack of clarity regarding external backends #4891

Open
tomeuv opened this issue Aug 24, 2024 · 6 comments
Labels
module: doc Related to our documentation, both in docs/ and docblocks

Comments

@tomeuv

tomeuv commented Aug 24, 2024

I'm not able to find any references to backends external to PyTorch/ExecuTorch. Are there any plans to provide a stable delegate API similar to TensorFlow Lite's? In any case, I think it would be good to have the current consensus on this topic in the documentation. See https://www.tensorflow.org/lite/performance/implementing_delegate#option_2_leverage_external_delegate

For context, I'm the author of two open-source NPU drivers, and I would love to support ExecuTorch in addition to TensorFlow Lite. See https://blog.tomeuvizoso.net/ for the details.

@dvorjackz
Contributor

cc @cccclai

@larryliu0820
Contributor

Hi @tomeuv, thanks for reaching out; it's great to see you're willing to onboard new delegates! Regarding your question, we have an example delegate here: https://github.com/pytorch/executorch/tree/main/backends/example (not sure if you have seen it?). Please let us know how we can improve the docs/example code!

@tomeuv
Author

tomeuv commented Aug 26, 2024

@larryliu0820 Would that allow for backends that ExecuTorch can dynamically load at runtime, like TensorFlow Lite can?

Right now it seems as if ExecuTorch requires all backends to be known at build time (with their source code hosted as part of ExecuTorch).

@dbort
Contributor

dbort commented Aug 26, 2024

Delegates do not need to live in the core repo, and they can be loaded dynamically. The two basic runtime requirements are:

  • Implement a subclass of PyTorchBackendInterface (example)
  • Before loading a Method, register an instance of that backend in the backend registry with register_backend (example)
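The two requirements above can be sketched as follows. This is a simplified, self-contained illustration of the pattern only, not ExecuTorch's actual headers: `BackendInterface`, `registry()`, and `MyNpuBackend` are stand-ins invented here for the real `PyTorchBackendInterface`, the backend registry, and a hypothetical out-of-tree delegate.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for PyTorchBackendInterface: the virtual interface the core
// runtime calls into when a loaded Method contains a delegated payload.
struct BackendInterface {
  virtual ~BackendInterface() = default;
  virtual bool is_available() const = 0;  // e.g. probe whether the NPU driver is present
  // The real interface also has init()/execute() hooks for the compiled blob.
};

// Stand-in for the process-global backend registry, keyed by backend name.
std::map<std::string, BackendInterface*>& registry() {
  static std::map<std::string, BackendInterface*> r;
  return r;
}

// Stand-in for register_backend(): returns false if the name is already taken.
bool register_backend(const std::string& name, BackendInterface& backend) {
  return registry().emplace(name, &backend).second;
}

// A hypothetical out-of-tree NPU delegate; it can live in its own shared
// library and register itself at startup, e.g. from a static initializer.
struct MyNpuBackend : BackendInterface {
  bool is_available() const override { return true; }
};
```

Because the registry is consulted by name at Method load time, nothing in the core runtime needs to know the backend at build time; it only needs `register_backend` to have run first, which is why dynamic loading works.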

@cccclai
Contributor

cccclai commented Aug 27, 2024

Thank you for reaching out, and glad that you're interested in bringing up new NPU drivers! I just read the blog, and it looks like the drivers are mostly for the Rockchip SoC NPU, with the SoC mainly targeting CV tasks? The answer to your question follows; please let us know if you have any further questions :)

The delegate API is fairly stable, and we've been working with a few partners to leverage more backends. An external delegate doesn't need to live in the core repo. Like @dbort mentioned, you just need to create the two components, one for ahead-of-time and one for runtime, and that's enough to use it. The code pointer provided by @larryliu0820 is a good example.

More detailed documentation can be found at https://pytorch.org/executorch/stable/compiler-delegate-and-partitioner.html, and we have the list of existing delegates (except Cadence) as reference examples.
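The "ahead-of-time" component mostly amounts to deciding which parts of the graph the backend will take and lowering them to the delegate. As a toy sketch of that partitioning idea (the names and the flat op-list model here are invented for illustration; the real flow uses the Partitioner/to_backend machinery documented at the link above):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A segment is a maximal run of ops that either all go to the delegate
// (first == true) or all stay on the CPU (first == false).
using Segment = std::pair<bool, std::vector<std::string>>;

// Toy partitioner: walk a linear op sequence and group consecutive ops by
// whether the backend supports them. Real partitioners work on a graph and
// must also respect data dependencies, not just adjacency.
std::vector<Segment> partition(const std::vector<std::string>& ops,
                               const std::set<std::string>& supported) {
  std::vector<Segment> segments;
  for (const auto& op : ops) {
    bool ok = supported.count(op) > 0;
    if (segments.empty() || segments.back().first != ok) {
      segments.push_back({ok, {}});  // start a new segment on a support change
    }
    segments.back().second.push_back(op);
  }
  return segments;
}
```

Each delegated segment would then be compiled into the blob that the runtime backend's execute hook consumes, while the unsupported segments fall back to the portable CPU kernels.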

@iseeyuan iseeyuan added the module: doc Related to our documentation, both in docs/ and docblocks label Aug 30, 2024
@tomeuv
Author

tomeuv commented Sep 16, 2024

> Thank you for reaching out, and glad that you're interested in bringing up new NPU drivers! I just read the blog, and it looks like the drivers are mostly for the Rockchip SoC NPU, with the SoC mainly targeting CV tasks?

I'm seeing more commercial interest in the VeriSilicon driver (especially as integrated in the i.MX8MP SoC). Both drivers are at the same level of functionality, but the Rockchip one is a bit behind in terms of upstreaming, as it needs a new kernel driver that is still under review.

I think it's fair to say that these NPUs have been mostly used for computer vision, but I'm seeing quite a few people making use of the Rockchip one for LLMs. This is a use case I will be working on in the future.

I have plans for adding support for other NPU IP.

> The answer to your question follows; please let us know if you have any further questions :)

> The delegate API is fairly stable, and we've been working with a few partners to leverage more backends.

Can you please expand on what the API stability guarantees are? I would be wary of implementing support for an unstable API, as users experience mismatches as bugs, and I won't be able to support all possible combinations of API versions.

To be clear, it's fine if the API sees occasional incompatible changes, as long as there is a versioning mechanism.

> An external delegate doesn't need to live in the core repo. Like @dbort mentioned, you just need to create the two components, one for ahead-of-time and one for runtime, and that's enough to use it. The code pointer provided by @larryliu0820 is a good example.

> More detailed documentation can be found at https://pytorch.org/executorch/stable/compiler-delegate-and-partitioner.html, and we have the list of existing delegates (except Cadence) as reference examples.

I think this is enough information to get me started. Thanks to all!

Feel free to close this one; I will file more specific issues if I see any opportunities to improve the documentation as I investigate.


6 participants