
Lack of clarity regarding external backends #4891

Open
tomeuv opened this issue Aug 24, 2024 · 6 comments
Labels
module: doc Related to our documentation, both in docs/ and docblocks

Comments

@tomeuv

tomeuv commented Aug 24, 2024

I'm not able to find any references to backends external to PyTorch/ExecuTorch. Are there any plans to provide a stable delegate API similar to TensorFlow Lite's? In any case, I think it would be good to have the current consensus on this topic in the documentation. See https://www.tensorflow.org/lite/performance/implementing_delegate#option_2_leverage_external_delegate

For context, I'm the author of two open-source NPU drivers, and I would love to support ExecuTorch in addition to TensorFlow Lite. See https://blog.tomeuvizoso.net/ for the details.

@dvorjackz
Contributor

cc @cccclai

@larryliu0820
Contributor

Hi @tomeuv, thanks for reaching out; it's great to see you're willing to onboard new delegates! Regarding your question, we have an example delegate here: https://github.com/pytorch/executorch/tree/main/backends/example (not sure if you have seen it?). Please let us know how we can improve the docs/example code!

@tomeuv
Author

tomeuv commented Aug 26, 2024

@larryliu0820 Would that allow for backends that ExecuTorch can dynamically load at runtime, like TensorFlow Lite can?

Right now it seems as if ExecuTorch requires all backends to be known at build time (with their source code hosted as part of ExecuTorch).

@dbort
Contributor

dbort commented Aug 26, 2024

Delegates do not need to live in the core repo, and they can be loaded dynamically. The two basic runtime requirements are:

  • Implement a subclass of PyTorchBackendInterface (example)
  • Before loading a Method, register an instance of that backend in the backend registry with register_backend (example)
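The two requirements above can be sketched as follows. This is a simplified, self-contained illustration of the pattern only, not ExecuTorch's actual headers: `BackendInterface`, `registry()`, and `MyNpuBackend` are stand-ins invented here for the real `PyTorchBackendInterface`, the backend registry, and a hypothetical out-of-tree delegate.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in for PyTorchBackendInterface: the virtual interface the core
// runtime calls into when a loaded Method contains a delegated payload.
struct BackendInterface {
  virtual ~BackendInterface() = default;
  virtual bool is_available() const = 0;  // e.g. probe whether the NPU driver is present
  // The real interface also has init()/execute() hooks for the compiled blob.
};

// Stand-in for the process-global backend registry, keyed by backend name.
std::map<std::string, BackendInterface*>& registry() {
  static std::map<std::string, BackendInterface*> r;
  return r;
}

// Stand-in for register_backend(): returns false if the name is already taken.
bool register_backend(const std::string& name, BackendInterface& backend) {
  return registry().emplace(name, &backend).second;
}

// A hypothetical out-of-tree NPU delegate; it can live in its own shared
// library and register itself at startup, e.g. from a static initializer.
struct MyNpuBackend : BackendInterface {
  bool is_available() const override { return true; }
};
```

Because the registry is consulted by name at Method load time, nothing in the core runtime needs to know the backend at build time; it only needs `register_backend` to have run first, which is why dynamic loading works.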

@cccclai
Contributor

cccclai commented Aug 27, 2024

Thank you for reaching out, and glad that you're interested in bringing up new NPU drivers! I just read the blog, and it looks like the drivers are mostly for the Rockchip SoC NPU, with the SoC mainly targeting CV tasks? The answer to your question follows; please let us know if you have any further questions :)

The delegate API is fairly stable, and we've been working with a few partners to leverage more backends. An external delegate doesn't need to live in the core repo. Like @dbort mentioned, you just need to create the two components, one for ahead-of-time and one for runtime, and that's enough to use it. The code pointer provided by @larryliu0820 is a good example.

More detailed documentation can be found at https://pytorch.org/executorch/stable/compiler-delegate-and-partitioner.html, and we have the list of existing delegates (except Cadence) as reference examples.
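The "ahead-of-time" component mostly amounts to deciding which parts of the graph the backend will take and lowering them to the delegate. As a toy sketch of that partitioning idea (the names and the flat op-list model here are invented for illustration; the real flow uses the Partitioner/to_backend machinery documented at the link above):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// A segment is a maximal run of ops that either all go to the delegate
// (first == true) or all stay on the CPU (first == false).
using Segment = std::pair<bool, std::vector<std::string>>;

// Toy partitioner: walk a linear op sequence and group consecutive ops by
// whether the backend supports them. Real partitioners work on a graph and
// must also respect data dependencies, not just adjacency.
std::vector<Segment> partition(const std::vector<std::string>& ops,
                               const std::set<std::string>& supported) {
  std::vector<Segment> segments;
  for (const auto& op : ops) {
    bool ok = supported.count(op) > 0;
    if (segments.empty() || segments.back().first != ok) {
      segments.push_back({ok, {}});  // start a new segment on a support change
    }
    segments.back().second.push_back(op);
  }
  return segments;
}
```

Each delegated segment would then be compiled into the blob that the runtime backend's execute hook consumes, while the unsupported segments fall back to the portable CPU kernels.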

@iseeyuan iseeyuan added the module: doc Related to our documentation, both in docs/ and docblocks label Aug 30, 2024
@tomeuv
Author

tomeuv commented Sep 16, 2024

> Thank you for reaching out, and glad that you're interested in bringing up new NPU drivers! I just read the blog, and it looks like the drivers are mostly for the Rockchip SoC NPU, with the SoC mainly targeting CV tasks?

I'm seeing more commercial interest in the VeriSilicon driver (especially as integrated in the i.MX8MP SoC). Both drivers are at the same level of functionality, but the Rockchip one is a bit behind in terms of upstreaming, as it needs a new kernel driver that is still under review.

I think it's fair to say that these NPUs have been mostly used for computer vision, but I'm seeing quite a few people making use of the Rockchip one for LLMs. This is a use case I will be working on in the future.

I have plans for adding support for other NPU IP.

> The answer to your question follows; please let us know if you have any further questions :)

> The delegate API is fairly stable, and we've been working with a few partners to leverage more backends.

Can you please expand on what the API stability guarantees are? I would be wary of implementing support for an unstable API, as users experience mismatches as bugs, and I won't be able to support all possible combinations of API versions.

To be clear, it's fine if the API sees occasional incompatible changes, as long as there is a versioning mechanism.

> An external delegate doesn't need to live in the core repo. Like @dbort mentioned, you just need to create the two components, one for ahead-of-time and one for runtime, and that's enough to use it. The code pointer provided by @larryliu0820 is a good example.

> More detailed documentation can be found at https://pytorch.org/executorch/stable/compiler-delegate-and-partitioner.html, and we have the list of existing delegates (except Cadence) as reference examples.

I think this is enough information to get me started. Thanks to all!

Feel free to close this one; I will file more specific issues if I see any opportunities to improve the documentation as I investigate.


6 participants