New transformers caching ETA now v4.38
See huggingface#1252 for more context.

The initial plan was for transformers 4.37 to add the new caching to all
architectures, but this was postponed to 4.38. The version check needs to
be adapted so that prompt tuning does not break when transformers 4.37 is
released.
BenjaminBossan committed Jan 11, 2024
1 parent 6451cbd commit 09502c7
Showing 1 changed file with 3 additions and 3 deletions.
src/peft/peft_model.py (3 additions, 3 deletions)

@@ -1141,11 +1141,11 @@ def prepare_inputs_for_generation(self, *args, task_ids: torch.Tensor = None, **
 
         # https://github.com/huggingface/transformers/pull/26681/ introduced new cache format
         # for some architectures which requires a special fix for prompt tuning etc.
-        # TODO: starting with transformers 4.37, all architectures should support caching.
-        uses_transformers_4_37 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.37.0")
+        # TODO: starting with transformers 4.38, all architectures should support caching.
+        uses_transformers_4_38 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.38.0")
         uses_transformers_4_36 = packaging.version.parse(transformers.__version__) >= packaging.version.parse("4.36.0")
         transformers_new_cache_archs = ["llama", "mistral", "persimmon", "phi"]
-        uses_cache = uses_transformers_4_37 or (
+        uses_cache = uses_transformers_4_38 or (
             uses_transformers_4_36 and self.base_model.config.model_type in transformers_new_cache_archs
         )
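For context, the gate in the diff boils down to the following standalone check. This is a minimal sketch, not peft's actual API: the helper name uses_new_cache is hypothetical, and model_type stands in for self.base_model.config.model_type.

    import packaging.version
    import transformers

    def uses_new_cache(model_type: str) -> bool:
        # Hypothetical helper mirroring the version gate in the diff above.
        version = packaging.version.parse(transformers.__version__)
        # From transformers 4.38 on, all architectures are expected to support
        # the new cache format.
        if version >= packaging.version.parse("4.38.0"):
            return True
        # Between 4.36 and 4.38, only these architectures were migrated
        # (see https://github.com/huggingface/transformers/pull/26681/).
        new_cache_archs = ["llama", "mistral", "persimmon", "phi"]
        return version >= packaging.version.parse("4.36.0") and model_type in new_cache_archs

With transformers 4.37 installed, the old ">= 4.37.0" threshold would have treated every architecture as using the new cache even though only the listed ones had been migrated; bumping the threshold to 4.38.0 keeps prompt tuning on the legacy code path for the rest.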
