
Cache: use batch_size instead of max_batch_size #32657

Merged

gante merged 3 commits into huggingface:main on Aug 16, 2024

Conversation

@gante (Member) commented on Aug 13, 2024

What does this PR do?

Renames the input argument max_batch_size, present in static-shaped caches, to batch_size. max_batch_size is imprecise: the cache needs the EXACT batch size being used.

The imprecise variable name and description were a source of issues, e.g. here.

NOTE: while it is technically feasible to accept smaller batch sizes in static-shaped caches, we would have to slice the cache at each layer. Slicing is an expensive operation, and the whole point of static-shaped caches is to be fast. In other words, accepting smaller batch sizes would silently enable incorrect usage of the class 🤗
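
For intuition, here is a minimal, illustrative sketch (not the actual transformers implementation) of why a static-shaped cache needs the exact batch size:

import torch

# A static cache pre-allocates its key/value tensors with a fixed batch
# dimension; every forward pass must then use exactly that batch size.
batch_size, num_heads, max_cache_len, head_dim = 8, 12, 100, 64
key_cache = torch.zeros(batch_size, num_heads, max_cache_len, head_dim)
value_cache = torch.zeros(batch_size, num_heads, max_cache_len, head_dim)

# Serving a smaller batch (say 4) would force a slice like key_cache[:4]
# at every layer and decoding step; those slices are exactly the expensive
# operations that static shapes are meant to avoid.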


✅ All changes are backwards compatible, and the user will only see a warning when passing the batch size through the deprecated keyword argument:

from transformers import AutoConfig, StaticCache

config = AutoConfig.from_pretrained("gpt2")

# No warnings
StaticCache(config, 8, 100, "cpu")
StaticCache(config=config, batch_size=8, max_cache_len=100, device="cpu")

# Warnings
StaticCache(config=config, max_batch_size=8, max_cache_len=100, device="cpu")

@HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp (Member) left a comment

Great, thanks for making it clearer!

@ArthurZucker (Collaborator) left a comment

Thanks for updating; we could keep the name and make sure the description is better as well!

Comment on lines +1020 to +1025
if max_batch_size is not None:
    logger.warning_once(
        f"The 'max_batch_size' argument of {self.__class__.__name__} is deprecated and will be removed in "
        "v4.46. Use the more precisely named 'batch_size' argument instead."
    )

Collaborator

Code on the Hub will complain, but yes, it makes sense.
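
For context, here is a minimal sketch of how such a keyword deprecation shim typically slots into a constructor. The class and signature below are simplified stand-ins for illustration, not copied from the PR:

import logging

logger = logging.getLogger(__name__)

class StaticCacheSketch:
    # Simplified stand-in; the real StaticCache also takes config,
    # max_cache_len, device, and more.
    def __init__(self, batch_size=None, max_batch_size=None):
        if max_batch_size is not None:
            # transformers uses logger.warning_once to deduplicate;
            # a plain warning is used in this sketch.
            logger.warning(
                f"The 'max_batch_size' argument of {self.__class__.__name__} is deprecated and "
                "will be removed in v4.46. Use the more precisely named 'batch_size' argument instead."
            )
            # Fall back to the deprecated value so old call sites keep working.
            batch_size = batch_size if batch_size is not None else max_batch_size
        self.batch_size = batch_size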

src/transformers/cache_utils.py (review comment outdated; resolved)
@gante merged commit cf32ee1 into huggingface:main on Aug 16, 2024
25 checks passed
@gante deleted the cache_var_name branch on Aug 16, 2024 at 10:48