Transformer Compression using SliceGPT #1052
Adding new pass to use SliceGPT compression technique to improve performance and reduce memory footprint. Updated phi2 example with a new workflow that uses the implemented pass.
@staticmethod
def _default_config(accelerator_spec: AcceleratorSpec) -> Dict[str, PassConfigParam]:
    return {
        "calibration_data_config": PassConfigParam(
Only data_config.name is used from this param. As we discussed offline, wouldn't it be simpler to have a string param called calibration_dataset_name or something similar?
The actual data config is not used at all. Having it here makes the config unnecessarily complicated. It also makes it appear that any data config is supported, when the tool supports only three data names.
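To make the suggestion concrete, here is a minimal sketch of what a string-valued parameter with an explicit allow-list could look like. Everything in it is an assumption for illustration: `ParamSketch` is a hypothetical stand-in for `PassConfigParam` (not the real Olive class), the helper names are invented, and the three dataset names are placeholders because the thread does not list the names the tool supports.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet


@dataclass
class ParamSketch:
    """Hypothetical stand-in for PassConfigParam; not the real Olive class."""

    type_: type
    required: bool = False
    default_value: Any = None
    description: str = ""


# Placeholder names only: the review says three dataset names are supported
# but does not say which ones.
SUPPORTED_CALIBRATION_DATASETS: FrozenSet[str] = frozenset(
    {"dataset_a", "dataset_b", "dataset_c"}
)


def default_config_sketch() -> Dict[str, ParamSketch]:
    """Expose a plain string parameter instead of a full data config."""
    return {
        "calibration_dataset_name": ParamSketch(
            type_=str,
            required=True,
            description=(
                "Name of the calibration dataset. Must be one of: "
                + ", ".join(sorted(SUPPORTED_CALIBRATION_DATASETS))
            ),
        )
    }


def validate_calibration_dataset_name(name: str) -> str:
    """Reject names outside the allow-list so unsupported data fails early."""
    if name not in SUPPORTED_CALIBRATION_DATASETS:
        raise ValueError(
            f"Unsupported calibration dataset {name!r}; expected one of "
            f"{sorted(SUPPORTED_CALIBRATION_DATASETS)}"
        )
    return name
```

With this shape, the config states up front that only a fixed set of names is accepted, which addresses the concern that a full data config implies broader support than the tool provides.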
I will follow up on this after discussing it with Devang. I think a full data_config keeps the option open for future extension, but it can always be changed later.
Release Note: New pass SliceGPT to compress transformer to improve performance and reduce memory footprint.
Checklist before requesting a review
lintrunner -a
(Optional) Issue link