[Tracker] WIP features for torchao 0.3 #252

Open · 12 of 19 tasks
supriyar opened this issue May 17, 2024 · 6 comments

supriyar (Contributor) commented May 17, 2024

Focus - benchmarking, documentation, tutorials, prototype to beta

Due date: June 13, 2024

Spillover from 0.2.0

Benchmarking

Documentation

Tutorials

  • Tutorial for affine quantization dtype and unified quant primitives - found lots of subtle differences, especially w.r.t. preserving zeros and tinygemm (@jerryzh168); see the sketch below
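
To make the "preserving zeros" subtlety concrete, here is a minimal pure-PyTorch sketch (illustrative only, not the torchao primitives): forcing 0.0 into the quantization range and using an integer zero_point guarantees that a real 0.0 round-trips exactly, which tinygemm-style floating-point zero_points do not.

```python
# Minimal sketch of "preserve zeros" in asymmetric affine quantization.
# Pure PyTorch with illustrative names; not the torchao quant primitives.
import torch

def choose_qparams(x: torch.Tensor, quant_min: int = -8, quant_max: int = 7):
    # Force 0.0 into the representable range so it lands on the integer grid.
    min_val = torch.minimum(x.min(), torch.zeros(()))
    max_val = torch.maximum(x.max(), torch.zeros(()))
    scale = (max_val - min_val) / (quant_max - quant_min)
    # An integer zero_point is what guarantees exact-zero round-trips;
    # tinygemm instead keeps the zero_point in the floating-point domain.
    zero_point = torch.round(quant_min - min_val / scale)
    return scale, zero_point

def quantize(x, scale, zero_point, quant_min=-8, quant_max=7):
    return torch.clamp(torch.round(x / scale) + zero_point, quant_min, quant_max)

def dequantize(q, scale, zero_point):
    return (q - zero_point) * scale

x = torch.randn(4, 4)
x[0, 0] = 0.0
scale, zp = choose_qparams(x)
assert dequantize(quantize(x, scale, zp), scale, zp)[0, 0] == 0.0  # zero preserved
```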

Core

  • QAT workflow (@andrewor14)
  • Dedup the implementations of quant primitives (@jerryzh168)
  • Dedup the implementations of quant APIs (@jerryzh168)
  • Deduplicate int4 workflows
  • Factory function and implements decorator for affine quantization dtype
  • Bit packing interfaces (@msaroufim) - see the packing sketch after this list
  • float6 kernels (@gau-nernst)
  • int3/int5 kernels (@msaroufim)
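
For the bit packing item above, a minimal pure-PyTorch sketch of the idea: packing two int4 values into one uint8. The function names and nibble layout are illustrative assumptions, not the torchao interface.

```python
# Sketch of sub-byte bit packing: two int4 values per uint8.
# Names and layout are assumptions for illustration, not the torchao API.
import torch

def pack_int4(x: torch.Tensor) -> torch.Tensor:
    # x: uint8 tensor with values in [0, 15] and an even number of elements.
    assert x.dtype == torch.uint8 and x.numel() % 2 == 0
    pairs = x.reshape(-1, 2)
    return (pairs[:, 0] << 4) | pairs[:, 1]  # high nibble | low nibble

def unpack_int4(packed: torch.Tensor) -> torch.Tensor:
    high = (packed >> 4) & 0xF
    low = packed & 0xF
    return torch.stack([high, low], dim=-1).reshape(-1)

vals = torch.randint(0, 16, (8,), dtype=torch.uint8)
assert torch.equal(unpack_int4(pack_int4(vals)), vals)  # lossless round-trip
```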
jeromeku (Collaborator) commented May 30, 2024

@msaroufim

  • Generally, what needs to be done to compose a new dtype with FSDP?
  • What other (high priority) dtypes are on the ao roadmap for integration with FSDP?
  • Is there a universal representation for asymmetrically/symmetrically quantized types in torch? I.e., a sub-byte/byte type with scale/zero_point that can be used regardless of the quantization method?
  • Is development of fp8 primitives for training and inference primarily in pytorch/float8_experimental or are there specific torchao initiatives focused on fp8?

Happy to contribute on any of these fronts.

jerryzh168 (Contributor) commented

@jeromeku for:

  • Is there a universal representation for asymmetrically/symmetrically quantized types in torch? I.e., a sub-byte/byte type with scale/zero_point that can be used regardless of the quantization method?

Yes, it's called AffineQuantizedTensor; we are putting all the variants (symmetric/asymmetric, per_tensor/per_channel/per_group/per_token, int8/int4/int3/...) under this tensor subclass.

Here is a model-level API walkthrough using the tensor subclass: https://github.com/pytorch/ao/tree/main/torchao/quantization#quantization-flow

Currently I'm working on replacing the existing APIs with it (#294); after that I plan to publish a more detailed tutorial on how to implement a new data representation with a tensor subclass, using this as an example.
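
A minimal sketch of that model-level flow, for reference. The entry points below (quantize_, int4_weight_only) match what later torchao releases export; since #294 was still migrating the APIs at the time, treat the exact names as assumptions.

```python
# Sketch of the model-level quantization flow; quantize_/int4_weight_only
# are the names in later torchao releases and an assumption here.
import torch
from torchao.quantization import quantize_, int4_weight_only

# int4 weight-only uses the tinygemm kernel, which needs bfloat16 + CUDA.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to(torch.bfloat16).cuda()

# Swaps each Linear weight for an AffineQuantizedTensor in place; the module
# structure is unchanged, only the weight representation differs.
quantize_(model, int4_weight_only())

x = torch.randn(1, 1024, dtype=torch.bfloat16, device="cuda")
out = model(x)
```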

msaroufim (Member) commented

@jeromeku regarding your other questions

  • To compose new dtypes with FSDP, you can follow the playbook in [FSDP2][NF4Tensor][2/n] implement torch.chunk and other ops #150 - a sketch of the general shape follows this list. We'll look to write some docs (@weifengpy), but let us know if you have any questions in the meantime.
  • Regarding high-priority dtypes, I'm not 100% sure yet since it depends on what researchers do - I'm personally biased toward getting bitnet to work.
  • Regarding fp8 training, we're looking to centralize the fp8 work in this repo, and @vkuzo is going to be moving bits over time.
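
Here is a minimal sketch of the general shape of that playbook: a __torch_dispatch__ wrapper subclass that registers handlers for the aten ops FSDP exercises when sharding (split, detach, ...). MyQuantTensor, its int8 payload, and the op table are hypothetical stand-ins, not the actual NF4Tensor code from #150.

```python
# Sketch of the #150 playbook: implement just the aten ops FSDP needs on a
# __torch_dispatch__ wrapper subclass. All names here are hypothetical.
import torch

ATEN_TABLE = {}

def implements(aten_ops):
    # Register a handler for a list of aten ops (mirrors torchao's pattern).
    def decorator(fn):
        for op in aten_ops:
            ATEN_TABLE[op] = fn
        return fn
    return decorator

class MyQuantTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, int_data, scale):
        return torch.Tensor._make_wrapper_subclass(
            cls, int_data.shape, dtype=scale.dtype, device=int_data.device
        )

    def __init__(self, int_data, scale):
        self.int_data = int_data  # quantized payload
        self.scale = scale

    @classmethod
    def __torch_dispatch__(cls, func, types, args, kwargs=None):
        if func in ATEN_TABLE:
            return ATEN_TABLE[func](func, args, kwargs or {})
        raise NotImplementedError(f"{func} not implemented for MyQuantTensor")

@implements([torch.ops.aten.detach.default])
def _detach(func, args, kwargs):
    t = args[0]
    return MyQuantTensor(t.int_data, t.scale)

@implements([torch.ops.aten.split.Tensor])
def _split(func, args, kwargs):
    # FSDP shards parameters via split/chunk on dim 0; forward to the payload.
    t, split_size = args[0], args[1]
    return [MyQuantTensor(chunk, t.scale) for chunk in t.int_data.split(split_size)]

t = MyQuantTensor(torch.randint(-128, 127, (8, 4), dtype=torch.int8), torch.tensor(0.1))
shards = torch.split(t, 4)  # dispatches to _split via __torch_dispatch__
```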

bhack commented Jun 13, 2024

Are we going to support dynamic inputs?

msaroufim (Member) commented

Hi @bhack! I've seen you on a lot of threads. Could you share a bit more about what you mean by dynamic inputs? Are you referring to dynamic shapes?

bhack commented Jun 14, 2024

Yes, dynamic input shapes, but not mainly on the batch dimension - e.g. images with different widths and heights.
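
For reference, this is what dynamic non-batch dims look like on the torch.compile side, via torch._dynamo.mark_dynamic on the height/width dims; whether torchao's quantized kernels can consume such symbolic shapes is exactly the open question here.

```python
# Sketch: marking image H/W (not batch) as dynamic for torch.compile.
import torch

model = torch.nn.Conv2d(3, 16, kernel_size=3, padding=1)
compiled = torch.compile(model)

x = torch.randn(1, 3, 224, 320)
# Treat dims 2 and 3 (height, width) as symbolic instead of specializing.
torch._dynamo.mark_dynamic(x, 2)
torch._dynamo.mark_dynamic(x, 3)
out = compiled(x)

y = torch.randn(1, 3, 192, 256)
torch._dynamo.mark_dynamic(y, 2)
torch._dynamo.mark_dynamic(y, 3)
out2 = compiled(y)  # reuses the dynamic-shape graph, no recompile
```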

msaroufim unpinned this issue Jun 29, 2024