Commit
Faster generation using AWQ + Fused modules (#27411)
* v1 fusing modules
* add fused mlp support
* up
* fix CI
* block save_pretrained
* fixup
* small fix
* add new condition
* add v1 docs
* add some comments
* style
* fix nit
* adapt from suggestion
* add check
* change arg names
* change variables name
* Update src/transformers/integrations/awq.py
  Co-authored-by: amyeroberts <[email protected]>
* style
* split up into 3 different private methods
* more conditions
* more checks
* add fused tests for custom models
* fix
* fix tests
* final update docs
* final fixes
* fix importlib metadata
* Update src/transformers/utils/quantization_config.py
  Co-authored-by: amyeroberts <[email protected]>
* change it to `do_fuse`
* nit
* Update src/transformers/utils/quantization_config.py
  Co-authored-by: Marc Sun <[email protected]>
* Update src/transformers/utils/quantization_config.py
  Co-authored-by: Marc Sun <[email protected]>
* Update src/transformers/utils/quantization_config.py
  Co-authored-by: Marc Sun <[email protected]>
* few fixes
* revert
* fix test
* fix copies
* raise error if model is not quantized
* add test
* use quantization_config.config when fusing
* Update src/transformers/modeling_utils.py

---------

Co-authored-by: amyeroberts <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
1 parent df40edf, commit fdb85be
Showing 7 changed files with 623 additions and 33 deletions.