
MAE using pretrained VIT #196

Open
Songloading opened this issue Jan 26, 2022 · 3 comments

@Songloading

Hi there,

I am currently trying to fine-tune an MAE based on a pretrained ViT from timm. However, when I do:

import timm
import torch.nn as nn
from vit_pytorch import MAE

v = timm.create_model('vit_base_patch16_224', pretrained=True)
num_ftrs = v.head.in_features
v.head = nn.Linear(num_ftrs, 2)  # replace the classification head for 2 classes
model = MAE(
    encoder = v,
    masking_ratio = 0.75,   # the paper recommended 75% masked patches
    decoder_dim = 512,      # paper showed good results with just 512
    decoder_depth = 6       # anywhere from 1 to 8
)

I got "AttributeError: 'VisionTransformer' object has no attribute 'pos_embedding'"
It seems that timm model is not compatible with the MAE implementation. Can this be easily fixed or I will have to change the internal implementation of MAE?
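
For context, the MAE wrapper here appears to read attributes such as pos_embedding (and to_patch_embedding) from its encoder; those exist on this repo's ViT class but not on timm's VisionTransformer, which names its positional embedding pos_embed. Below is a minimal sketch of the usage the wrapper expects, adapted from the vit-pytorch README with illustrative hyperparameters, and it does not use a timm checkpoint:

import torch
from vit_pytorch import ViT, MAE

# encoder built with this repo's ViT, which exposes the attributes MAE expects
v = ViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 8,
    mlp_dim = 2048
)

mae = MAE(
    encoder = v,
    masking_ratio = 0.75,   # the paper recommended 75% masked patches
    decoder_dim = 512,      # paper showed good results with just 512
    decoder_depth = 6       # anywhere from 1 to 8
)

images = torch.randn(8, 3, 256, 256)
loss = mae(images)   # reconstruction loss on the masked patches
loss.backward()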

@lucidrains
Owner

@Songloading i have no idea! i'm not really familiar with timm - perhaps you can ask Ross about it?

@Songloading
Author

@lucidrains OK. Any ideas on what other pretrained ViT I could use besides their implementation, or on a pretrained MAE?

@mw9385

mw9385 commented Jul 17, 2023

Did you solve the issue? @Songloading
