questions about top_down_transform #7

Open
wefwefWEF2 opened this issue Jan 15, 2024 · 1 comment

Comments

@wefwefWEF2

Hi, thanks a lot for your great work. I have a question about top_down_transform.

Why do we multiply masked_x by top_down_transform here, given that the selected features have already been obtained?

top_down_transform = prompt[..., None] @ prompt[..., None].transpose(-1, -2)
x = x @ top_down_transform * 5

@bfshi
Owner

bfshi commented Mar 21, 2024

Hi, that's a good question. This step selects the relevant features along the channel dimension, whereas the previous selection operates along the spatial dimension. We find that selecting along both dimensions enhances the effect of top-down attention.
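
One way to read the channel-wise selection, sketched in NumPy rather than PyTorch: the outer product of prompt with itself is a rank-1 matrix, so multiplying by it projects each token's feature vector onto the prompt direction. The shapes below (x as batch × tokens × channels, one prompt vector per sample) and the random inputs are assumptions made for illustration, not taken from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: batch, tokens, channels (not taken from the repo).
B, N, C = 2, 4, 8
x = rng.standard_normal((B, N, C))      # token features
prompt = rng.standard_normal((B, C))    # assumed: one prompt vector per sample

# Same expression as in the issue, written with NumPy:
# (B, C, 1) @ (B, 1, C) -> (B, C, C), a rank-1 matrix p p^T per sample.
top_down_transform = prompt[..., None] @ np.swapaxes(prompt[..., None], -1, -2)
out = x @ top_down_transform * 5        # (B, N, C)

# Equivalent view: take each token's dot product with the prompt,
# then scale the prompt vector by that coefficient (and by 5).
coeff = np.einsum("bnc,bc->bn", x, prompt)              # (B, N)
out_equiv = 5 * coeff[..., None] * prompt[:, None, :]   # (B, N, C)

same = np.allclose(out, out_equiv)
rank = int(np.linalg.matrix_rank(top_down_transform[0]))
```

Under these assumptions, every output token is a multiple of the prompt vector: channels with large prompt entries are kept and the rest are suppressed, which is consistent with the description of selection along the channel dimension.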
