Transformer Compression using SliceGPT #1052

shaahji · 2024-04-04T16:47:50Z

Transformer Compression using SliceGPT

Adds a new pass that applies the SliceGPT compression technique to improve performance and reduce memory footprint.
Updates the phi2 example with a new workflow that uses the new pass.

Release Note: New SliceGPT pass that compresses transformer models to improve performance and reduce memory footprint.

Checklist before requesting a review

Add unit tests for this change.
Make sure all tests can pass.
Update documents if necessary.
Lint and apply fixes to your code by running lintrunner -a
Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
Does this PR include example changes? If yes, please remember to update the example documentation in a follow-up PR.

(Optional) Issue link

olive/passes/pytorch/slicegpt.py

docs/source/features/passes/pytorch.md

examples/phi2/phi2_slicegpt.json

examples/phi2/phi2.py

jambayk · 2024-04-09T21:55:33Z

olive/passes/pytorch/slicegpt.py

+    @staticmethod
+    def _default_config(accelerator_spec: AcceleratorSpec) -> Dict[str, PassConfigParam]:
+        return {
+            "calibration_data_config": PassConfigParam(


Only data_config.name is used from this param. As we discussed offline, wouldn't it be simpler to have a string param called calibration_dataset_name or something similar?

The actual data config is not used at all. Having it here makes the config unnecessarily complicated. It also makes it appear that any data config is supported, when the tool supports only three dataset names.

I will follow up on this once I have discussed it with Devang. I think a full data_config keeps the option open for future extension, but it can always be changed later.
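For illustration, the string-parameter alternative raised in this thread could look roughly like the sketch below. It is not the code from this PR. `ParamSketch` is a hypothetical stand-in for Olive's `PassConfigParam` so the snippet runs on its own, the dataset names are placeholders (the thread does not list the three supported names), and `validate_calibration_dataset_name` is an assumed helper.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class ParamSketch:
    """Hypothetical stand-in for Olive's PassConfigParam (fields are illustrative)."""

    type_: type
    required: bool = False
    default_value: Any = None
    description: Optional[str] = None


# Placeholder names only: the review says three names are supported
# but does not list them, so these are NOT the real values.
SUPPORTED_CALIBRATION_DATASETS: Tuple[str, ...] = ("dataset_a", "dataset_b", "dataset_c")


def default_config_sketch() -> Dict[str, ParamSketch]:
    """Expose a plain string parameter instead of a full data config."""
    return {
        "calibration_dataset_name": ParamSketch(
            type_=str,
            required=True,
            description=(
                "Name of the calibration dataset. Supported: "
                + ", ".join(SUPPORTED_CALIBRATION_DATASETS)
            ),
        ),
    }


def validate_calibration_dataset_name(name: str) -> str:
    """Reject names outside the supported set up front, with a clear message."""
    if name not in SUPPORTED_CALIBRATION_DATASETS:
        raise ValueError(
            f"Unsupported calibration dataset {name!r}; "
            f"expected one of {SUPPORTED_CALIBRATION_DATASETS}"
        )
    return name
```

A string parameter makes the supported set explicit in the description and lets the pass reject unsupported names early. The trade-off noted in the reply still applies: a full data config leaves more room for later extension.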

shaahji marked this pull request as draft April 4, 2024 16:47

github-advanced-security bot found potential problems Apr 4, 2024

olive/passes/pytorch/slicegpt.py (fixed)

shaahji force-pushed the shaahji/slicegpt branch from 1b1a556 to 450a963 Compare April 4, 2024 21:21

shaahji changed the title from "Introducing SliceGPT pass" to "Transformer Compression using SliceGPT" Apr 4, 2024

shaahji marked this pull request as ready for review April 4, 2024 21:23

shaahji force-pushed the shaahji/slicegpt branch from 450a963 to 28b0060 Compare April 4, 2024 21:59

devang-ml reviewed Apr 4, 2024

docs/source/features/passes/pytorch.md (outdated, resolved)

examples/phi2/phi2_slicegpt.json (outdated, resolved)

shaahji force-pushed the shaahji/slicegpt branch from 28b0060 to b3f02de Compare April 5, 2024 07:31

github-advanced-security bot found potential problems Apr 5, 2024

examples/phi2/phi2.py (fixed)

shaahji force-pushed the shaahji/slicegpt branch 2 times, most recently from d069691 to 147545d Compare April 5, 2024 08:11

devang-ml reviewed Apr 8, 2024

docs/source/features/passes/pytorch.md (outdated, resolved)

olive/passes/pytorch/slicegpt.py (resolved)

shaahji force-pushed the shaahji/slicegpt branch 6 times, most recently from 31cd97a to 88556c5 Compare April 9, 2024 07:04

devang-ml previously approved these changes Apr 9, 2024

jambayk reviewed Apr 9, 2024

examples/phi2/phi2.py (outdated, resolved)

jambayk reviewed Apr 9, 2024

examples/phi2/phi2.py (outdated, resolved)

jambayk reviewed Apr 9, 2024

examples/phi2/phi2.py (resolved)

Transformer Compression using SliceGPT

e19acb8

Adding new pass to use SliceGPT compression technique to improve performance and reduce memory footprint. Updated phi2 example with a new workflow that uses the implemented pass.

shaahji dismissed devang-ml’s stale review via e19acb8 April 9, 2024 17:38

shaahji force-pushed the shaahji/slicegpt branch from 88556c5 to e19acb8 Compare April 9, 2024 17:38

devang-ml approved these changes Apr 9, 2024

jambayk reviewed Apr 9, 2024

shaahji merged commit ffd1d8f into main Apr 9, 2024
33 checks passed

shaahji deleted the shaahji/slicegpt branch April 9, 2024 22:56
