Question for the encoder_hidden_states #28

Open
WayneML opened this issue Jan 17, 2024 · 4 comments

Comments

@WayneML

WayneML commented Jan 17, 2024

When I run the script, I find that encoder_hidden_states is all zeros.

@WayneML
Author

WayneML commented Jan 18, 2024

if args.conditioning_dropout_prob is not None:
    random_p = torch.rand(
        bsz, device=latents.device, generator=generator)
    # Sample masks for the edit prompts.
    prompt_mask = random_p < 2 * args.conditioning_dropout_prob
    prompt_mask = prompt_mask.reshape(bsz, 1, 1)
    # Final text conditioning.
    null_conditioning = torch.zeros_like(encoder_hidden_states)
    encoder_hidden_states = torch.where(
        prompt_mask, null_conditioning.unsqueeze(1), encoder_hidden_states.unsqueeze(1))

I found something strange in this code block. It seems that “random_p = torch.rand(bsz, device=latents.device, generator=generator)” always makes random_p a one-dimensional tensor with a single value when the batch size is 1.
That makes prompt_mask a single True rather than a list of Boolean values:
prompt_mask = random_p < 2 * args.conditioning_dropout_prob
prompt_mask = prompt_mask.reshape(bsz, 1, 1)
# Final text conditioning.
null_conditioning = torch.zeros_like(encoder_hidden_states)
encoder_hidden_states = torch.where(
    prompt_mask, null_conditioning.unsqueeze(1), encoder_hidden_states.unsqueeze(1))
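
To make the shapes concrete, here is a minimal sketch of the same dropout logic run on its own with a batch size of 1. It assumes encoder_hidden_states has shape (bsz, dim), which the snippet does not state, and apply_conditioning_dropout is a hypothetical wrapper written for this illustration, not a function from the repository.

```python
import torch


def apply_conditioning_dropout(encoder_hidden_states, dropout_prob, generator=None):
    """Hypothetical wrapper around the quoted block.

    Assumes encoder_hidden_states has shape (bsz, dim).
    """
    bsz = encoder_hidden_states.shape[0]
    # One uniform sample in [0, 1) per batch element, so shape (bsz,).
    random_p = torch.rand(
        bsz, device=encoder_hidden_states.device, generator=generator)
    # Boolean mask, True where the conditioning should be dropped.
    prompt_mask = (random_p < 2 * dropout_prob).reshape(bsz, 1, 1)
    null_conditioning = torch.zeros_like(encoder_hidden_states)
    # The (bsz, 1, 1) mask broadcasts against the (bsz, 1, dim) tensors.
    dropped = torch.where(
        prompt_mask,
        null_conditioning.unsqueeze(1),
        encoder_hidden_states.unsqueeze(1))
    return random_p, prompt_mask, dropped


if __name__ == "__main__":
    states = torch.ones(1, 4)  # batch size 1, illustrative dim of 4
    random_p, prompt_mask, dropped = apply_conditioning_dropout(states, 0.1)
    print(random_p.shape)     # torch.Size([1])
    print(prompt_mask.shape)  # torch.Size([1, 1, 1])
    print(dropped.shape)      # torch.Size([1, 1, 4])
```

Under these assumptions, random_p holds one value per batch element, and with a batch size of 1 the mask is a boolean tensor holding a single entry. The result is all zeros only on steps where that entry is True, which happens when the sample falls below 2 * conditioning_dropout_prob.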

@WayneML
Author

WayneML commented Jan 18, 2024

Also, is this still meant for the image-to-video task? It looks like it is used for text-to-image.

@pixeli99
Owner

Hi, I didn't quite understand what you meant. Are you asking why the encoder_hidden_states need to be replaced with zeros?

@mmxbc1223

Can the encoder_hidden_states be replaced with a text embedding for text-to-video tasks?
