New transformers caching ETA now v4.38
See huggingface#1252 for more context.

The initial plan was for transformers 4.37 to add the new caching to all
architectures, but this was postponed to 4.38. The version check needs to
be adapted so that prompt tuning does not break when transformers 4.37 is
released.
BenjaminBossan committed Jan 11, 2024
1 parent 6451cbd commit 09502c7
Showing 1 changed file with 3 additions and 3 deletions.
src/peft/peft_model.py (3 additions, 3 deletions)

@@ -1141,11 +1141,11 @@ def prepare_inputs_for_generation(self, *args, task_ids: torch.Tensor = None, **
 
         # https://github.com/huggingface/transformers/pull/26681/ introduced new cache format
         # for some architectures which requires a special fix for prompt tuning etc.
-        # TODO: starting with transformers 4.37, all architectures should support caching.
-        uses_transformers_4_37 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.37.0")
+        # TODO: starting with transformers 4.38, all architectures should support caching.
+        uses_transformers_4_38 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.38.0")
         uses_transformers_4_36 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.36.0")
         transformers_new_cache_archs = ["llama", "mistral", "persimmon", "phi"]
-        uses_cache = uses_transformers_4_37 or (
+        uses_cache = uses_transformers_4_38 or (
             uses_transformers_4_36 and self.base_model.config.model_type in transformers_new_cache_archs
         )
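For context, the gate in the diff boils down to the following standalone check. This is a minimal sketch, not peft's actual API: the helper name uses_new_cache is hypothetical, and model_type stands in for self.base_model.config.model_type.

    import packaging.version
    import transformers

    def uses_new_cache(model_type: str) -> bool:
        # Hypothetical helper mirroring the version gate in the diff above.
        version = packaging.version.parse(transformers.__version__)
        # From transformers 4.38 on, all architectures are expected to support
        # the new cache format.
        if version >= packaging.version.parse("4.38.0"):
            return True
        # Between 4.36 and 4.38, only these architectures were migrated
        # (see https://github.com/huggingface/transformers/pull/26681/).
        new_cache_archs = ["llama", "mistral", "persimmon", "phi"]
        return version >= packaging.version.parse("4.36.0") and model_type in new_cache_archs

With transformers 4.37 installed, the old ">= 4.37.0" threshold would have treated every architecture as using the new cache even though only the listed ones had been migrated; bumping the threshold to 4.38.0 keeps prompt tuning on the legacy code path for the rest.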
