
Unstable as backbone for semantic segmentation #9

Closed
shkarupa-alex opened this issue Mar 2, 2022 · 2 comments

Comments

@shkarupa-alex
Contributor

I've tried to use VAN Large as a backbone for binary semantic segmentation and found it very unstable.

Model: UPerNet, previously well tested with ResNet50 and Swin Base.
Features: same as in https://github.com/Visual-Attention-Network/VAN-Segmentation

Just switching the backbone to VAN Large fails after 5 of 7 epochs: the model generates NaN outputs.
At the same time, F1 and accuracy grow from epoch 1 to 5 (so this is not divergence).

Right now I have no time to dive deeper to find the source of the NaNs, so this is just feedback.
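
For anyone who does want to localize where the NaNs first appear, here is a minimal sketch using PyTorch forward hooks (the helper name and usage are hypothetical, not from this issue or the VAN code):

```python
# Hypothetical debugging helper (not from VAN or this issue): register forward
# hooks that report any module whose output contains NaN/Inf values.
import torch
import torch.nn as nn

def attach_nan_hooks(model: nn.Module):
    def make_hook(name):
        def hook(module, inputs, output):
            if isinstance(output, torch.Tensor) and not torch.isfinite(output).all():
                print(f"Non-finite values in output of {name}")
        return hook
    for name, module in model.named_modules():
        module.register_forward_hook(make_hook(name))

# Usage sketch: call attach_nan_hooks(segmentation_model) before the failing
# epoch, then run a forward pass on a batch that produces NaNs.
```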

@Andy1621 commented Mar 2, 2022

@shkarupa-alex Such a phenomenon is common in transformer-style backbones; you can find a similar issue in DeiT.
In my experiments, layer scale, which is used in VAN, sometimes causes this kind of unstable training.

# From VAN's Block: learnable per-channel layer-scale parameters,
# initialized to layer_scale_init_value.
self.layer_scale_1 = nn.Parameter(
    layer_scale_init_value * torch.ones((dim)), requires_grad=True)
self.layer_scale_2 = nn.Parameter(
    layer_scale_init_value * torch.ones((dim)), requires_grad=True)
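
For context, a minimal sketch of how such a layer scale is typically applied to the residual branch (a generic transformer-style MLP block assumed here, not the exact VAN code):

```python
import torch
import torch.nn as nn

class BlockWithLayerScale(nn.Module):
    """Generic residual block with layer scale (sketch, not VAN's actual Block)."""

    def __init__(self, dim, layer_scale_init_value=1e-2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
        # Learnable per-channel scale on the residual branch, initialized small.
        self.layer_scale = nn.Parameter(
            layer_scale_init_value * torch.ones(dim), requires_grad=True)

    def forward(self, x):
        # The branch output is multiplied element-wise by the learned scale;
        # if the scale or the branch activations grow large during fine-tuning,
        # the residual sum can blow up and eventually produce NaNs.
        return x + self.layer_scale * self.mlp(self.norm(x))
```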

Maybe you can try our UniFormer; we have avoided some tricks that cause unstable training, and all models/configs/logs are released.

@MenghaoGuo
Contributor

Thanks all.

We did not find this problem in the training process. We'll explore it if we have time.
