
[tracking] Eager execution support. #105

Open
5 of 19 tasks
raikonenfnu opened this issue Oct 24, 2023 · 1 comment
Labels: tracking-issue (Tracking Issue)
Milestone: Path to V1

raikonenfnu (Member) commented Oct 24, 2023

PyTorch v1.0 EagerMode

A bit of background from PyTorch's site on eager mode vs. graph/JIT/FX mode:

"PyTorch supports two execution modes [1]: eager mode and graph mode. In eager mode, operators in a model are immediately >executed as they are encountered. In contrast, in graph mode, operators are first synthesized into a graph, which will then be >compiled and executed as a whole. Eager mode is easier to use, more suitable for ML researchers, and hence is the default mode >of execution. On the other hand, graph mode typically delivers higher performance and hence is heavily used in production."

We'd like to introduce PyTorch v1.0's eager mode support in Turbine. To do so, we need the features/tasks below:

  • Base device and tensor.py to intercept __torch_function__ (d323a81). (See the sketch after this list.)
  • Plumb through the e2e compiler pipeline for computation via __torch_dispatch__ and __torch_function__ (torch -> torch.fx -> mlir) with per-session kernel caching. (validating here) - merged
  • Set up an eager-specific executable. (validating here) - merged
  • Refactor to generate a new DeviceTensor from an existing device buffer, to avoid moving the buffer back to the host just to construct the new DeviceTensor. (validating here) depends on - merged.
  • Refactor the EagerExecutable and compute_method (compiler pipeline + execution) pipeline to run with the async-exec execution model. (validating here) - merged.
  • Instead of the current "create device based on flags" API, we need a "create device with these kwargs" API (e.g. for specifying task_topology_max_group_count).
  • Add support for GPU devices.
  • (💪) Local kernel cache.
  • (💪) Plumb through/get autograd working with EagerMode.
  • (💪) Support jit.script, or something jit.script-like, to use fewer dispatches. (jit script source code)
  • (💪) Refactor the e2e compiler pipeline to use torch.compile().
  • (💪) Add more substantial model examples and operator support.
  • (💪) Add support for ops with multiple outputs when dims are specified (i.e. torch.max(t1, dim=1) or torch.topk(t1, dim=1)); regular torch.max and torch.topk should work out of the box. With dims we currently see this (error message). (See the note after the sketch below.)

💪 = help wanted
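
For the first item, a minimal sketch of the interception mechanism. Only the __torch_function__ hook itself is standard PyTorch API; DeviceTensor here is just a stand-in for the class referenced in the tasks, and the print stands in for the real compile-and-dispatch path with per-session kernel caching:

    import torch

    class DeviceTensor(torch.Tensor):
        # Hypothetical subclass in the spirit of the tensor.py task above.
        # __torch_function__ lets a subclass intercept every torch.* call
        # made on its instances, which is the hook eager mode builds on.
        @classmethod
        def __torch_function__(cls, func, types, args=(), kwargs=None):
            kwargs = kwargs or {}
            # A real implementation would look up or compile a kernel for
            # `func` (per-session cache) and run it on the device; here we
            # just log and fall back to the default behavior.
            print(f"intercepted: {getattr(func, '__name__', func)}")
            return super().__torch_function__(func, types, args, kwargs)

    t = torch.randn(2, 2).as_subclass(DeviceTensor)
    y = torch.relu(t)   # prints "intercepted: relu" before dispatching

On the last item: with dim specified, torch.max(t1, dim=1) returns a (values, indices) pair rather than a single tensor, so the interceptor has to unpack and re-wrap multiple results; the whole-tensor reduction torch.max(t1) returns just one, which is why it works out of the box.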

@stellaraccident stellaraccident added this to the Path to V1 milestone Oct 31, 2023
@stellaraccident stellaraccident changed the title Pytorch v1.0 EagerMode support. Eager execution support. Oct 31, 2023
@stellaraccident stellaraccident changed the title Eager execution support. [tracking] Eager execution support. Oct 31, 2023
@stellaraccident stellaraccident added the tracking-issue Tracking Issue label Oct 31, 2023
vivekkhandelwal1 (Contributor) commented:
Hi @raikonenfnu, are we still working on this issue?
