Reapply Autoquant (#82) #109

cpuhrsch · 2024-04-01T22:20:19Z

No description provided.

Summary: Adds autoquantization functionality. Using the do_quant API, we can test kernel speeds and pick the best quantization type (or no quantization) for each layer.

Test Plan: python test/test.py -k "autoquant"; also tested on SAM and SDXL: pytorch-labs/segment-anything-fast#114, HDCharles/sdxl-fast@8d9942a

Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
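The selection idea in the summary (time each option, keep the fastest, allow "no quantization" to win) can be sketched in plain Python. This is only an illustration under stated assumptions, not the code added by this PR: `wall_clock`, `pick_fastest`, and the candidate labels are hypothetical names, and the candidates are ordinary callables rather than real quantized kernels.

```python
import time
from typing import Callable, Dict

# Hypothetical type alias: a zero-argument callable that runs one candidate kernel.
Kernel = Callable[[], object]


def wall_clock(fn: Kernel, repeats: int = 5) -> float:
    """Return the best-of-N wall-clock time (seconds) for one call of fn."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def pick_fastest(
    candidates: Dict[str, Kernel],
    benchmark: Callable[[Kernel], float] = wall_clock,
) -> str:
    """Benchmark every candidate and return the name of the fastest one.

    Including an unquantized baseline among the candidates lets
    "no quantization" win when it is the quickest option.
    """
    timings = {name: benchmark(fn) for name, fn in candidates.items()}
    return min(timings, key=timings.get)
```

Passing the benchmark function in as a parameter keeps the selection logic separate from the timing method, so a per-layer loop could call `pick_fastest` once per layer with that layer's candidates.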

cpuhrsch · 2024-04-02T02:38:13Z

README.md

-torch._inductor.config.force_fuse_int_mm_with_mul = True
+# inductor settings which improve torch.compile performance for quantized modules
+torch._inductor.config.force_fuse_int_mm_with_mul
+torch._inductor.config.use_mixed_mm


This needs `= True`, or perhaps `with config` etc.
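To illustrate the point of the review comment: in Python, a bare attribute reference is evaluated and discarded, so the two added README lines would not enable anything without an assignment. The sketch below uses a `SimpleNamespace` as a stand-in for a config object. It is not the real `torch._inductor.config`, and the `False` starting values are an assumption made only for the demonstration.

```python
from types import SimpleNamespace

# Stand-in config object (assumption): NOT the real torch._inductor.config.
config = SimpleNamespace(force_fuse_int_mm_with_mul=False, use_mixed_mm=False)

# Bare attribute references: each value is read and thrown away, nothing changes.
config.force_fuse_int_mm_with_mul
config.use_mixed_mm
unchanged = (config.force_fuse_int_mm_with_mul, config.use_mixed_mm)

# Assignments: these are what actually turn the flags on.
config.force_fuse_int_mm_with_mul = True
config.use_mixed_mm = True
enabled = (config.force_fuse_int_mm_with_mul, config.use_mixed_mm)
```

After the bare references the flags still hold their starting values; only the assignments change them.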

facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 1, 2024

cpuhrsch added 9 commits April 1, 2024 15:42

Skip autoquant CPU tests

edd2708

Add more device skips

550d6af

Merge main

015b31a

Remove merge artifact

c14ab3d

Remove duplicate test

c40a175

Remove duplicate test

bc4eb7b

Reduce top level API

9b632e2

Version guards

9bc66b4

Change import path

e24c607

cpuhrsch added 18 commits April 5, 2024 16:23

Clean up init

429fd86

Merge remote-tracking branch 'origin' into autoquant2

4a603d7

Merge remote-tracking branch 'origin' into autoquant2

d4b11bc

Clean up import

57d9ffa

Calm down test shapes and deal with imports

e3e82d9

Clean up import

62f3787

Multiple of 16

d2a573b

Version guards

74526f6

More parameterizations

a6fbe2f

More parameterizations

0f225ec

Merge branch 'main' of github.com:pytorch-labs/ao into autoquant2

907ca84

Run all tests

33bbc33

bfloat16 guard

f433553

Shape guards

e4c62e4

Working shapes

24d2bcc

Exclude huge shape

e75b68c

Exclude huge shape

0a908bf

Exclude huge shape

6207659

Update readme

a1b8a0c

cpuhrsch merged commit c403580 into main Apr 5, 2024
7 checks passed

dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024

Reapply Autoquant (pytorch#82) (pytorch#109)

37ae1d2
