
Unstable as backbone for semantic segmentation #9

Closed
shkarupa-alex opened this issue Mar 2, 2022 · 2 comments

Comments

@shkarupa-alex
Contributor

I've tried to use VAN Large as a backbone for binary semantic segmentation and found it very unstable.

Model: UPerNet, previously well tested with ResNet50 and Swin Base.
Features: same as in https://github.com/Visual-Attention-Network/VAN-Segmentation

Just switching the backbone to VAN Large fails after 5 of 7 epochs: the model generates NaN outputs.
At the same time, F1 and accuracy grow from epoch 1 to 5 (so this is not divergence).

Right now I have no time to dive deeper to find the source of the NaNs, so this is just feedback.
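
For anyone who does want to localize where the NaNs first appear, here is a minimal sketch using PyTorch forward hooks (the helper name and usage are hypothetical, not from this issue or the VAN code):

```python
# Hypothetical debugging helper (not from VAN or this issue): register forward
# hooks that report any module whose output contains NaN/Inf values.
import torch
import torch.nn as nn

def attach_nan_hooks(model: nn.Module):
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"Non-finite values in output of {name}")
        return hook
    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))

# Usage sketch: call attach_nan_hooks(segmentation_model) before the failing
# epoch, then run a forward pass on a batch that produces NaNs.
```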

@Andy1621 commented Mar 2, 2022

@shkarupa-alex Such a phenomenon is common in transformer-style backbones; you can find a similar issue in DeiT.
In my experiments, layer scale, which is used in VAN, sometimes causes this kind of unstable training.

# From VAN's Block: learnable per-channel layer-scale parameters,
# initialized to layer_scale_init_value.
self.layer_scale_1 = nn.Parameter(
    layer_scale_init_value * torch.ones((dim)), requires_grad=True)
self.layer_scale_2 = nn.Parameter(
    layer_scale_init_value * torch.ones((dim)), requires_grad=True)
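
For context, a minimal sketch of how such a layer scale is typically applied to the residual branch (a generic transformer-style MLP block assumed here, not the exact VAN code):

```python
import torch
import torch.nn as nn

class BlockWithLayerScale(nn.Module):
    """Generic residual block with layer scale (sketch, not VAN's actual Block)."""

    def __init__(self, dim, layer_scale_init_value=1e-2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # Learnable per-channel scale on the residual branch, initialized small.
        self.layer_scale = nn.Parameter(
            layer_scale_init_value * torch.ones(dim), requires_grad=True)

    def forward(self, x):
        # The branch output is multiplied element-wise by the learned scale;
        # if the scale or the branch activations grow large during fine-tuning,
        # the residual sum can blow up and eventually produce NaNs.
        return x + self.layer_scale * self.mlp(self.norm(x))
```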

Maybe you can try our UniFormer; we have avoided some tricks that cause unstable training, and all models/configs/logs are released.

@MenghaoGuo
Contributor

Thanks all.

We did not find this problem in the training process. We'll explore it if we have time.
