
AttributeError: 'Tensor' object has no attribute 'config' #3

Open
WenshuangSong opened this issue Apr 9, 2024 · 14 comments

@WenshuangSong

WenshuangSong commented Apr 9, 2024

When I run "python train.py --config ./configs/config.yaml", I get the following error:

File "/home/ubuntu/us/project/MotionInversion/train.py", line 463, in
main(config)
File "/home/ubuntu/us/project/MotionInversion/train.py", line 407, in main
log_validation(
File "/home/ubuntu/us/project/MotionInversion/train.py", line 84, in log_validation
video_frames = pipeline(
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/diffusers/pipelines/text_to_video_synthesis/pipeline_text_to_video_synth.py", line 644, in call
prompt_embeds, negative_prompt_embeds = self.encode_prompt(
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/diffusers/pipelines/text_to_video_synthesis/pipeline_text_to_video_synth.py", line 290, in encode_prompt
prompt_embeds = self.text_encoder(text_input_ids.to(device), attention_mask=attention_mask)
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/accelerate/utils/operations.py", line 581, in forward
return model_forward(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/accelerate/utils/operations.py", line 569, in call
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
return func(*args, **kwargs)
File "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 818, in forward
return_dict = return_dict if return_dict is not None else self.config.use_return_dict
AttributeError: 'Tensor' object has no attribute 'config'

My versions are diffusers==0.26.3 and transformers==4.27.4.
When I print "self" at "/home/ubuntu/anaconda3/envs/sdwebui/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 818,
I find that "self" is a CLIPTextModel when not at a checkpointing step, as follows:
CLIPTextModel(
  (text_model): CLIPTextTransformer(
    (embeddings): CLIPTextEmbeddings(
      (token_embedding): Embedding(49408, 1024)
      (position_embedding): Embedding(77, 1024)
    )
    (encoder): CLIPEncoder(
      (layers): ModuleList(
        (0-22): 23 x CLIPEncoderLayer(
          (self_attn): CLIPAttention(
            (k_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (v_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (q_proj): Linear(in_features=1024, out_features=1024, bias=True)
            (out_proj): Linear(in_features=1024, out_features=1024, bias=True)
          )
          (layer_norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
          (mlp): CLIPMLP(
            (activation_fn): GELUActivation()
            (fc1): Linear(in_features=1024, out_features=4096, bias=True)
            (fc2): Linear(in_features=4096, out_features=1024, bias=True)
          )
          (layer_norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        )
      )
    )
    (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  )
)

But "self" is a tensor at checkpointing_steps as follows:
tensor([[49406, 320, 31777, 15939, 2528, 320, 1305, 3980, 49407, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0]], device='cuda:0')
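
A hedged diagnostic sketch: the traceback shows the text encoder's forward being invoked with a Tensor bound as `self`, which usually means the method was re-bound somewhere. One known failure mode is accelerate's mixed-precision forward wrapper being stripped in place when the model is unwrapped during checkpoint saving. The names `pipeline` and `log_validation` follow the traceback above; everything else is an assumption to help narrow down the bug, not verified against train.py.

```python
from transformers import CLIPTextModel

def check_text_encoder(pipeline):
    # Expected: <class 'transformers...CLIPTextModel'>; if this prints
    # something else, the checkpoint-saving code replaced the encoder.
    print(type(pipeline.text_encoder))
    # Expected: a method bound to that CLIPTextModel instance; a function
    # bound to a Tensor here would reproduce the AttributeError above.
    print(pipeline.text_encoder.forward)
    assert isinstance(pipeline.text_encoder, CLIPTextModel), \
        "text_encoder was mutated during checkpoint saving"

# Call this right before log_validation() at a checkpointing step to see
# whether the checkpoint-saving code has already mutated the text encoder.
```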

@wileewang
Collaborator

It seems to be related to the version of transformers. Actually, the latest version of transformers also works. You can give it a try.
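
For reference, a quick sanity check of which versions are actually active in the environment (useful here, since a conda environment can shadow a pip install):

```python
import diffusers
import transformers

print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
```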

@WenshuangSong
Author

> It seems to be related to the version of transformers. Actually, the latest version of transformers also works. You can give it a try.

But I have tried transformers==4.39.3, which also doesn't work.

@wileewang
Collaborator

Does it report the same error?

@WenshuangSong
Author

> Does it report the same error?

Yes.

@WenshuangSong
Author

> Does it report the same error?

It happens after the checkpoint is saved at the first checkpointing step.

@wileewang
Collaborator

It's a little weird. You can disable the checkpointing operation by adjusting checkpointing_steps in config.yaml, then try to load the motion embedding via the inference code. I will check it later.
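
A minimal sketch of this workaround, assuming checkpointing_steps and max_train_steps sit at the top level of config.yaml (both keys are named in this thread); the output path and file name of the saved motion embedding are assumptions about the repo layout, not its actual API:

```python
import yaml
import torch

# Push checkpointing_steps past max_train_steps so no mid-run checkpoint
# is ever saved, sidestepping the code path that triggers the error.
with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)
cfg["checkpointing_steps"] = cfg["max_train_steps"] + 1
with open("configs/config.yaml", "w") as f:
    yaml.safe_dump(cfg, f)

# After training, load the final motion embedding for inference
# (hypothetical file name; check what the training script actually writes).
motion_embed = torch.load("outputs/motion_embed.pt", map_location="cpu")
```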

@WenshuangSong
Author

> It's a little weird. You can disable the checkpointing operation by adjusting checkpointing_steps in config.yaml, then try to load the motion embedding via the inference code. I will check it later.

Yes, I have set checkpointing_steps: 200 and max_train_steps: 200. But will this affect the final results? My results are as follows:

"A knight in armor rides a Segway",

tmp_yhbhzx1.mp4

"A cat in armor driving a go-kart",

1.mp4

@wileewang
Collaborator

I can't see your results.

@WenshuangSong
Author

> I can't see your results.

"A toy train chugs around a roundabout tree"
https://github.com/WenshuangSong/file/blob/main/2.mp4

"A cat in armor driving a go-kart",
https://github.com/WenshuangSong/file/blob/main/1.mp4

"A knight in armor rides a Segway",
https://github.com/WenshuangSong/file/blob/main/tmp_yhbhzx1.mp4

"A teddy bear is riding a tricycle in Times Square"
https://github.com/WenshuangSong/file/blob/main/3.mp4

@wileewang
Collaborator

It looks like you are not using any noise initialization strategy. The quality of video model generation depends strongly on the initial noise, which is discussed in our paper and other related literature. Since our motion embedding has very few parameters, it is not recommended to use it alone.

Alternatively, if you wish to use the motion embedding purely for video customization, you will need to update config.yaml to enlarge the motion embedding by including 320 in the dim parameter, and change the loss type to BaseLoss. Note that doing so also increases the risk of overfitting.
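
A hedged sketch of the suggested config change: add 320 to the motion embedding's dim list and switch the loss to BaseLoss. The key names follow the comment above; their exact nesting inside config.yaml is an assumption, so adjust the lookups to match the actual file:

```python
import yaml

with open("configs/config.yaml") as f:
    cfg = yaml.safe_load(f)

if 320 not in cfg["dim"]:          # assumed top-level key per the comment above
    cfg["dim"].insert(0, 320)      # enlarge the motion embedding
cfg["loss_type"] = "BaseLoss"      # assumed key name for the loss selection

with open("configs/config.yaml", "w") as f:
    yaml.safe_dump(cfg, f)
```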

@WenshuangSong
Author

> It looks like you are not using any noise initialization strategy. The quality of video model generation depends strongly on the initial noise, which is discussed in our paper and other related literature. Since our motion embedding has very few parameters, it is not recommended to use it alone.
>
> Alternatively, if you wish to use the motion embedding purely for video customization, you will need to update config.yaml to enlarge the motion embedding by including 320 in the dim parameter, and change the loss type to BaseLoss. Note that doing so also increases the risk of overfitting.

Thanks for your recommendation. I have now tried the noise initialization strategy, using the training input video as the initialization video (I'm not sure whether that is reasonable), but I still can't get a reasonable result. Here are my test results.
I used the longboard-24 video as my training source video, which was also used as input for the noise initialization strategy at the inference stage:

https://github.com/WenshuangSong/file/blob/main/longboard-24%20(1).mp4

At the inference stage, my prompt is "A pigeon is strutting around a town square", and my results are as follows:
https://github.com/WenshuangSong/file/blob/main/6.mp4
https://github.com/WenshuangSong/file/blob/main/7.mp4

They don't seem as reasonable as the results on your project page. Did something go wrong? Thanks a lot for your reply.
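
A minimal sketch of one common noise-initialization scheme, assuming a diffusers TextToVideoSDPipeline as `pipe` and the source video as a tensor `frames` of shape (f, c, h, w) normalized to [-1, 1]. It illustrates the general idea (start sampling from re-noised latents of the source video rather than pure Gaussian noise); it is not necessarily the exact strategy from the MotionInversion paper:

```python
import torch

with torch.no_grad():
    # Encode each source frame into the VAE latent space.
    latents = pipe.vae.encode(frames.to(pipe.device)).latent_dist.sample()
    latents = latents * pipe.vae.config.scaling_factor   # (f, 4, h/8, w/8)

# Reshape to the pipeline's expected (batch, channels, frames, h/8, w/8) layout.
latents = latents.permute(1, 0, 2, 3).unsqueeze(0)

# Re-noise the clean latents to the first (noisiest) sampling timestep.
pipe.scheduler.set_timesteps(50)
noise = torch.randn_like(latents)
init_latents = pipe.scheduler.add_noise(latents, noise,
                                        pipe.scheduler.timesteps[:1])

# num_frames must match the number of source frames for the shapes to line up.
video = pipe("A pigeon is strutting around a town square",
             num_frames=latents.shape[2], latents=init_latents).frames
```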

@wileewang
Collaborator

I can't check your errors based on the results alone. Were you able to successfully run the checkpointing steps in your training? It is recommended to follow that process completely for inference. Also, you can wait for us to release the online Gradio demo if you are still having trouble with the AttributeError.

@WenshuangSong
Author

> I can't check your errors based on the results alone. Were you able to successfully run the checkpointing steps in your training? It is recommended to follow that process completely for inference. Also, you can wait for us to release the online Gradio demo if you are still having trouble with the AttributeError.

No, I can't successfully run the checkpointing steps in my training stage. So, when will the online Gradio demo be released? Thanks!

@wileewang
Collaborator

We will release it as soon as possible; please be patient. :)
