Mamba, unlike Transformer models, does not take an attention mask as input (see the signature here). As such, it does not support padding and will return different values for padded and unpadded inputs.
(I'm going to open a PR to try to prevent this issue from happening again.)
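A quick way to confirm this in the installed version is to inspect the forward signatures directly. This is just an illustrative check, assuming the transformers 4.41.2 install reported below:

```python
# MambaForCausalLM.forward has no attention_mask parameter,
# unlike a regular Transformer causal LM such as GPT-2.
import inspect
from transformers import GPT2LMHeadModel, MambaForCausalLM

print(inspect.signature(MambaForCausalLM.forward))  # no attention_mask argument
print(inspect.signature(GPT2LMHeadModel.forward))   # attention_mask is accepted
```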
System Info
transformers version: 4.41.2

Who can help?
@ArthurZucker @gante
Information

Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)

Reproduction
I have trained a MambaForCausalLM model on a custom dataset.
I am using the following code to generate the next token in eval mode -
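(The original snippet was not captured here. The following is a minimal sketch of the kind of setup described; the checkpoint name, prompt, and max_length are placeholder assumptions, not values from the issue.)

```python
# Minimal sketch (not the original snippet): next-token prediction with and
# without left padding for a MambaForCausalLM checkpoint.
import torch
from transformers import AutoTokenizer, MambaForCausalLM

checkpoint = "state-spaces/mamba-130m-hf"  # placeholder; the issue uses a custom-trained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

model = MambaForCausalLM.from_pretrained(checkpoint)
model.eval()

prompt = "The quick brown fox"
padded = tokenizer(prompt, padding="max_length", max_length=32, return_tensors="pt")
unpadded = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    padded_next = model(input_ids=padded["input_ids"]).logits[:, -1].argmax(-1)
    unpadded_next = model(input_ids=unpadded["input_ids"]).logits[:, -1].argmax(-1)

# With left padding the predicted next token can differ from the unpadded run,
# because Mamba's forward pass has no attention mask to ignore the pad tokens.
print(tokenizer.decode(padded_next), tokenizer.decode(unpadded_next))
```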
Expected behavior
The tokenizer pads the input on the left side. When I change the padding="max_length" argument so that the inputs are tokenized without padding, I get a different token as the prediction.
Using model.generate shows the same issue.