Triplane Concatenation and Module Groups #21

Chrixtar · 2023-08-14T15:49:27Z

Hello Hansheng,

Thank you very much for this clean codebase, great work!

If I am not mistaken, the denoising UNet follows the typical DDPM architecture but takes concatenated triplanes as input instead of images.
Geometrically, this concatenation and the resulting kernel sharing within the convolutional layers do not seem intuitive to me.
Do you see what I mean, or should I elaborate on this?

In the code, I have seen that you have also overridden all mmgen modules (MultiHeadAttention, DenoisingResBlock, etc.) in order to make them grouped operations, so it seems that you have also tried to denoise the planes individually.
If this is the case, I am very curious about the results, how they compare with denoising the triplanes jointly, and your interpretation of them :)
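
To make the distinction between grouped and joint processing concrete, here is a minimal NumPy sketch. It is not the repository's code: `conv1x1` is a hypothetical helper, and the assumption is that the three planes are stacked along the channel axis so that `groups=3` gives each plane its own slice of the weights. A 1x1 convolution is used only because it is the simplest case to write out.

```python
import numpy as np

def conv1x1(x, w, groups=1):
    """Hypothetical 1x1 convolution.

    x: (N, C_in, H, W), w: (C_out, C_in // groups).
    With groups > 1, each channel group is mixed only with itself.
    """
    n, c_in, h, wd = x.shape
    c_out = w.shape[0]
    xg = x.reshape(n, groups, c_in // groups, h, wd)
    wg = w.reshape(groups, c_out // groups, c_in // groups)
    out = np.einsum("ngchw,goc->ngohw", xg, wg)
    return out.reshape(n, c_out, h, wd)

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8                       # illustrative sizes, not from the repo
x = rng.standard_normal((1, 3 * C, H, W))
x_perturbed = x.copy()
x_perturbed[:, C:2 * C] += 1.0          # change only the second plane

w_grouped = rng.standard_normal((3 * C, C))      # one weight slice per plane
w_joint = rng.standard_normal((3 * C, 3 * C))    # all planes mixed together

y_g = conv1x1(x, w_grouped, groups=3)
y_g_p = conv1x1(x_perturbed, w_grouped, groups=3)
y_j = conv1x1(x, w_joint)
y_j_p = conv1x1(x_perturbed, w_joint)

# Grouped: output channels of the first plane ignore the second plane.
grouped_plane0_unchanged = np.allclose(y_g[:, :C], y_g_p[:, :C])
# Joint: every output channel can depend on every plane.
joint_plane0_unchanged = np.allclose(y_j[:, :C], y_j_p[:, :C])
```

In the grouped case the planes never exchange information inside the layer, whereas in the joint case every output channel sees all three planes.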

Again, thanks for your efforts.
Best regards
Chris

Lakonik · 2023-08-16T04:58:51Z

Hi Chris, thanks for your interest in our work!

We did try grouped operations in some early experiments, but to no avail. We have currently settled on either stacked triplanes (concatenated along the channel dimension) or tiled triplanes (a.k.a. rollout, i.e., concatenated along a spatial dimension).
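
The two layouts mentioned above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the repository's implementation: the `(N, 3, C, H, W)` input layout, the sizes, and the choice of the width axis for the rollout are all assumed here.

```python
import numpy as np

N, C, H, W = 2, 4, 8, 8  # illustrative sizes, not taken from the repo
planes = np.arange(N * 3 * C * H * W, dtype=np.float32).reshape(N, 3, C, H, W)

# Stacked: fold the plane axis into the channel axis -> (N, 3*C, H, W).
stacked = planes.reshape(N, 3 * C, H, W)

# Tiled (rollout): put the planes side by side along the width -> (N, C, H, 3*W).
# Using the width axis is an assumption; the height axis would work the same way.
tiled = np.concatenate([planes[:, i] for i in range(3)], axis=-1)

# Both layouts are lossless rearrangements and can be inverted.
from_stacked = stacked.reshape(N, 3, C, H, W)
from_tiled = np.stack(np.split(tiled, 3, axis=-1), axis=1)
```

With the stacked layout a convolution mixes all three planes at the same pixel location through its channel dimension, while with the tiled layout the same kernel slides over each plane in turn and the planes interact only near the seams or through attention layers.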
