An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
The docs (e.g. for the Mistral forward method) state that:
If past_key_values is used, optionally only the last decoder_input_ids have to be input (see past_key_values).
If past_key_values are used, the user can optionally input only the last input_ids (those that don’t have their past key value states given to this model) of shape (batch_size, 1) instead of all input_ids of shape (batch_size, sequence_length).
Yes, you're right! When past_key_values is passed, the inputs must be cropped to the unprocessed tokens only. The cropping works a bit differently when a SinkCache object is used, but the general rule is to pass all new tokens that are not in the KV cache yet.
Thanks for noting the discrepancy in the docs, I will update the docs regarding past_kv this week!
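To make the cropping rule above concrete, here is a minimal sketch in plain Python. The helper name `crop_to_unprocessed` and the `cache_length` argument are hypothetical and not part of transformers; the sketch assumes a regular cache that already holds the first `cache_length` tokens of every row, and it does not cover the SinkCache case.

```python
from typing import List


def crop_to_unprocessed(input_ids: List[List[int]], cache_length: int) -> List[List[int]]:
    """Keep only the tokens whose key/value states are not cached yet.

    Hypothetical helper for illustration only (not a transformers API).
    Assumes the cache already covers the first `cache_length` tokens of each row.
    """
    if cache_length < 0:
        raise ValueError("cache_length must be non-negative")
    for row in input_ids:
        if cache_length >= len(row):
            raise ValueError("every row needs at least one token that is not cached")
    # Drop the prefix that the cache already covers.
    return [row[cache_length:] for row in input_ids]


# Two sequences of length 4; the first 3 tokens of each are already in the cache.
full_ids = [[101, 7, 8, 9], [101, 4, 5, 6]]
new_ids = crop_to_unprocessed(full_ids, cache_length=3)
print(new_ids)  # [[9], [6]] -> shape (batch_size, 1)
```

With a real model, the cropped ids would be what is passed as input_ids together with past_key_values.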
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Current documentation
Who can help?
No response
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
The docs (e.g. for the Mistral forward method) state that:
https://huggingface.co/docs/transformers/main/model_doc/mistral#transformers.MistralModel.forward
Expected behavior
It is my understanding that passing only the last input ids (those that don’t have their past key value states given to this model) is in fact not optional but obligatory, as the case where the full input ids are passed is not handled. Cf. https://discuss.huggingface.co/t/correct-input-ids-when-passing-past-key-values/92044
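A toy model of the concern, in plain Python: `ToyCache` is a hypothetical stand-in that appends one entry per input token without checking what it already holds. It is not the transformers implementation, only an illustration of why full input ids combined with an existing cache would be a problem if nothing crops them.

```python
from typing import List


class ToyCache:
    """Toy stand-in for a KV cache: one stored entry per processed token."""

    def __init__(self) -> None:
        self.entries: List[int] = []

    def update(self, input_ids: List[int]) -> int:
        # Appends blindly; there is no check for tokens that are already cached.
        self.entries.extend(input_ids)
        return len(self.entries)


prompt = [101, 7, 8]
next_token = 9

# Passing only the new token: the cache grows by exactly one entry.
correct = ToyCache()
correct.update(prompt)
correct_len = correct.update([next_token])

# Passing the full sequence again: the prompt is stored a second time.
wrong = ToyCache()
wrong.update(prompt)
wrong_len = wrong.update(prompt + [next_token])

print(correct_len, wrong_len)  # 4 7
```

In this toy setting the second call leaves 7 entries for a 4-token sequence, which is the kind of mismatch the issue describes.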