* autoquant using aqt

Summary:

Changes autoquant to use aqt instead of the old subclass subtensors. aqt now first dispatches to a static `_quantized_linear_op`, which then dispatches to the normal kernel function. This gives autoquant an extension point to modify the kernel functions for the various quantization modes without editing the main kernel function of all the classes. linear_activation_quantized_tensor got the same treatment.

There were some transposes found in the aqt kernels that were not present in the subclass kernels; however, they do not seem to affect performance (see benchmark_results.txt for an autoquant perf run).

Test Plan:

sh benchmarks.sh
python test_integration.py

Reviewers:

Subscribers:

Tasks:

Tags:
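The extension-point pattern the summary describes can be sketched as follows. This is a minimal, dependency-free illustration of the idea, not torchao's actual code: the class and kernel bodies here (`QuantizedTensor`, `default_linear_kernel`, `AutoquantVariant`) are hypothetical, and only the `_quantized_linear_op` name mirrors the commit's description.

```python
# Sketch of the dispatch pattern: the tensor subclass routes linear calls
# through a static _quantized_linear_op, so a variant (e.g. an autoquant
# mode) can swap the kernel by overriding one static method, without
# editing the main kernel function itself.

def default_linear_kernel(act, weight, bias):
    # Plain matmul-plus-bias stand-in for the "normal" kernel function.
    return [[sum(a * w for a, w in zip(row, col)) + b
             for col, b in zip(zip(*weight), bias)]
            for row in act]

class QuantizedTensor:
    def __init__(self, data):
        self.data = data

    @staticmethod
    def _quantized_linear_op(act, weight, bias):
        # Default path: dispatch straight to the normal kernel.
        return default_linear_kernel(act, weight, bias)

class AutoquantVariant(QuantizedTensor):
    @staticmethod
    def _quantized_linear_op(act, weight, bias):
        # Extension point: a quantization mode can substitute its own
        # kernel here; this sketch just reuses the default.
        return default_linear_kernel(act, weight, bias)
```

Because dispatch always goes through the static method first, callers never need to know which variant they hold; overriding `_quantized_linear_op` is the whole integration surface.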
Showing 4 changed files with 86 additions and 24 deletions.