AttributeError: 'tuple' object has no attribute 'to_legacy_cache' #28045
Comments
hi @wuxb45
I have the same issue with transformers 4.36.1. I am using the DeepSpeed framework to generate a response and hit the same error.
We cannot help you if you don't share a reproducible snippet. The way this part of the code works should not trigger this error, because the past key values are cast to the new Cache class.
Me too. I also struggled with this problem for a long time, using DeepSpeed-Chat to train reinforcement-learning code. @liziniu
I don't have the capacity to produce a repro at this time. The issue came from running a code base forked from DeepSpeed-Chat's step 3. I'm sorry that I cannot provide more information now.
I solved this by removing tensor parallelism. It seems that merging the per-device tensors converted the Cache to a tuple.
Hi, I also faced the same issue. May I ask how you actually removed tensor parallelism if you are also using the DeepSpeed-Chat code?
I had the same errors with
I tried version
Hi everyone, please let us know whenever you can share a small reproducible snippet, as we can't do anything to fix the bug without a repro.
You can probably try the following code with different transformers versions to reproduce:
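The snippet itself did not survive in this thread. As a stand-in, here is a hedged sketch of the kind of setup described above (a DeepSpeed-injected Llama model generating with the KV cache); the checkpoint name, prompt, and engine options are illustrative assumptions, not taken from the report:

```python
# Sketch of the reported setup: DeepSpeed kernel injection around a
# Llama model, then generation with the KV cache enabled. On affected
# transformers 4.36.x paths, the injected attention layers return
# legacy (key, value) tuples, so LlamaModel.forward later calls
# .to_legacy_cache() on a plain tuple and crashes.
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# DeepSpeed replaces the attention modules with its own kernels.
engine = deepspeed.init_inference(model, dtype=torch.float16, replace_with_kernel_inject=True)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(engine.module.device)
# Expected on affected versions:
# AttributeError: 'tuple' object has no attribute 'to_legacy_cache'
outputs = engine.module.generate(**inputs, max_new_tokens=20, use_cache=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```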
To me, only
Alright, this is pretty much a duplicate of #28003. We made a mistake by not advertising it more so that other repos could get ready with testing; feel free to share it on the other issue.
To me, it works with
Apparently this issue was introduced by PR #26681 from @tomaarsen and @patrickvonplaten. next_decoder_cache should be a Cache, which means it is not well initialized as one: instead of a tuple, the new HF implementation passes a list of Cache objects: https://github.com/tomaarsen/transformers/blob/ee60b1cc13e2819ef31e69952c0b6f616bd724b8/src/transformers/models/llama/modeling_llama.py#L287C45-L287C76

The attention layers now take layer_idx: Optional[int] = None and past_key_value: Optional[Cache] = None (https://github.com/tomaarsen/transformers/blob/ee60b1cc13e2819ef31e69952c0b6f616bd724b8/src/transformers/models/llama/modeling_llama.py#L355). layer_idx is later used by the cache. Note that the diff introduces a kind of cache (for the attention KV cache) which implements to_legacy_cache. I guess the DeepSpeed version does not instantiate the Llama attention correctly, or we should change the code as @fxmarty suggests.
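For context, the round-trip the new code relies on looks like this; a minimal sketch assuming transformers >= 4.36, with made-up tensor shapes:

```python
# Legacy tuple-of-tuples KV cache <-> DynamicCache round-trip.
# The crash in this issue happens when a layer (e.g. one injected by
# DeepSpeed) hands back the legacy tuple directly, so next_decoder_cache
# ends up a plain tuple and tuple.to_legacy_cache() does not exist.
import torch
from transformers import DynamicCache

# Legacy format: one (key, value) pair per layer, each of shape
# (batch, num_heads, seq_len, head_dim); sizes here are illustrative.
legacy = tuple(
    (torch.zeros(1, 32, 5, 128), torch.zeros(1, 32, 5, 128)) for _ in range(2)
)

cache = DynamicCache.from_legacy_cache(legacy)     # tuple -> Cache
assert isinstance(cache.to_legacy_cache(), tuple)  # Cache -> tuple
```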
I am facing a similar issue, AttributeError: 'tuple' object has no attribute 'to_legacy_cache', while training Llama 7B. What is the concluded solution?
If you did not change your version of transformers
transformers==4.35.2 works.
My transformers version is 4.45.0.dev0. It works.
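For custom loops that still carry legacy tuples around, a small guard can also sidestep the attribute error on newer versions; ensure_cache below is a hypothetical helper, not a transformers API:

```python
# Hypothetical helper (not part of transformers): normalize whatever
# KV cache a custom generation loop carries into the Cache object that
# transformers >= 4.36 expects.
from transformers import DynamicCache

def ensure_cache(past_key_values):
    """Return a Cache, converting a legacy tuple-of-tuples if needed."""
    if isinstance(past_key_values, tuple):
        return DynamicCache.from_legacy_cache(past_key_values)
    return past_key_values  # already a Cache object (or None)

# Usage sketch:
# outputs = model(input_ids, past_key_values=ensure_cache(pkv), use_cache=True)
```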
System Info
transformers 4.36.1.
This error pops up when running inference with a Llama 2 model on the new transformers 4.36.1. I didn't test 4.36.0. It was running correctly with 4.35.x.
This seems to be related to the changes from #26681 and commit 633215b.
Who can help?
@ArthurZucker and @younesbelkada, according to the suggestions in "Who can help?"
Information
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
Reproduction
Sorry that I don't have an easy repro right now. Here is the relevant stack trace:
Expected behavior
Crash with the provided stack trace.