Triplane Concatenation and Module Groups #21

Open
Chrixtar opened this issue Aug 14, 2023 · 1 comment


@Chrixtar

Hello Hansheng,

thank you very much for this clean codebase, great work!

If I am not mistaken, the denoising UNet is the typical DDPM architecture, except that it takes concatenated triplanes as input instead of images.
Geometrically, this concatenation, and the resulting kernel sharing across planes within the convolutional layers, does not seem intuitive to me.
Do you see what I mean, or should I elaborate on this?

In the code, I have seen that you have also overridden all mmgen modules (MultiHeadAttention, DenoisingResBlock, etc.) to make them grouped operations. It seems you have also tried denoising the planes individually.
If so, I am very curious about the results, how they compare with denoising the triplanes jointly, and how you interpret them :)

Again, thanks for your efforts.
Best regards
Chris
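
To make the distinction raised in the question concrete, here is a minimal NumPy sketch contrasting a joint 1x1 convolution over channel-concatenated triplanes with a grouped one. It is not the repository's code: the function names, shapes, and the choice of a 1x1 kernel are assumptions made for illustration only. In the joint case every output channel mixes all three planes; in the grouped case each plane has its own kernel and sees only its own channels.

```python
import numpy as np

def joint_conv1x1(x, weight):
    """Joint 1x1 conv: x is (3*C, H, W), weight is (O, 3*C).
    Every output channel is a mix of channels from all three planes."""
    return np.einsum('oc,chw->ohw', weight, x)

def grouped_conv1x1(x, weights):
    """Grouped 1x1 conv: x is (G*C, H, W), weights is (G, O, C).
    Each group (plane) is processed by its own kernel, with no cross-plane mixing."""
    g, o, c = weights.shape
    h, w = x.shape[1:]
    xg = x.reshape(g, c, h, w)                      # split channels back into planes
    out = np.einsum('goc,gchw->gohw', weights, xg)  # per-plane channel mixing
    return out.reshape(g * o, h, w)

# Hypothetical sizes: 3 planes, 2 channels each, 4x4 resolution.
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4, 4))
w_joint = rng.standard_normal((6, 6))
w_group = rng.standard_normal((3, 2, 2))

# Perturb only the first plane (channels 0 and 1).
x_pert = x.copy()
x_pert[:2] += 1.0

def per_plane_change(a, b):
    """Largest absolute output change within each plane's channel block."""
    return np.abs(a - b).reshape(3, 2, 4, 4).max(axis=(1, 2, 3))

change_joint = per_plane_change(joint_conv1x1(x_pert, w_joint), joint_conv1x1(x, w_joint))
change_grouped = per_plane_change(grouped_conv1x1(x_pert, w_group), grouped_conv1x1(x, w_group))
print("joint:  ", change_joint)    # all three output blocks change
print("grouped:", change_grouped)  # only the first output block changes
```

With grouped operations, information can flow between planes only through whatever non-grouped layers remain in the network, which is one way to read the difference between denoising planes individually and jointly.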

@Lakonik
Owner

Lakonik commented Aug 16, 2023

Hi Chris, thanks for your interest in our work!

We did try grouped operations in some early experiments, but to no avail. We have currently settled on either stacked triplanes (concatenating along the channel dimension) or tiled triplanes (a.k.a. rollout, i.e., concatenating along a spatial dimension).
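
The two layouts mentioned above can be sketched in NumPy as follows. This is an illustrative sketch, not the repository's implementation: the function names, the `(3, C, H, W)` input layout, and tiling along the width axis are all assumptions.

```python
import numpy as np

def stack_triplanes(planes):
    """Stacked layout: (3, C, H, W) -> (3*C, H, W), concatenating along channels."""
    n, c, h, w = planes.shape
    return planes.reshape(n * c, h, w)

def tile_triplanes(planes):
    """Tiled (rollout) layout: (3, C, H, W) -> (C, H, 3*W).
    Planes are placed side by side along the width axis (an assumed choice)."""
    return np.concatenate(list(planes), axis=-1)

# Hypothetical sizes: 3 planes, 2 channels each, 4x4 resolution.
planes = np.arange(3 * 2 * 4 * 4, dtype=np.float32).reshape(3, 2, 4, 4)

stacked = stack_triplanes(planes)
tiled = tile_triplanes(planes)
print(stacked.shape)  # (6, 4, 4)
print(tiled.shape)    # (2, 4, 12)
```

In the stacked layout a convolution kernel sees the same pixel location in all three planes at once, whereas in the tiled layout the same kernel weights are slid over each plane in turn and only mix planes near the seams.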
