@shkarupa-alex This phenomenon is common in transformer-style backbones; a similar issue has been reported for DeiT.
In my experiments, layer scale, which VAN uses, sometimes causes this kind of unstable training.
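For readers unfamiliar with the term, here is a minimal, framework-agnostic sketch of what layer scale does, assuming the common formulation in which a learnable per-channel factor `gamma` multiplies the residual branch before it is added back to the input. The function name `layer_scale_residual` and the initial value `1e-2` are illustrative assumptions, not taken from the VAN code.

```python
def layer_scale_residual(x, branch_out, gamma):
    """Return x + gamma * branch_out, with one gamma entry per channel."""
    if not (len(x) == len(branch_out) == len(gamma)):
        raise ValueError("x, branch_out and gamma must have the same length")
    return [xi + g * bi for xi, bi, g in zip(x, branch_out, gamma)]


# Hypothetical initial value; in a real model gamma is a trainable parameter.
init_value = 1e-2
gamma = [init_value] * 3

x = [1.0, 2.0, 3.0]              # block input, one value per channel
branch_out = [10.0, -20.0, 30.0]  # output of the residual branch

y = layer_scale_residual(x, branch_out, gamma)
```

Because `gamma` is trained, the effective weight of the residual branch can change during training, so logging its magnitude per block is one thing worth checking when outputs become unstable.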
I've tried to use VAN Large as the backbone for binary semantic segmentation and found it very unstable.
Model: UPerNet, previously well tested with ResNet50 and Swin Base.
Features: same as in https://github.com/Visual-Attention-Network/VAN-Segmentation
Simply switching the backbone to VAN Large fails after 5 of 7 epochs: the model produces NaN outputs.
At the same time, F1 and accuracy improve from epoch 1 to epoch 5, so this is not divergence.
Right now I have no time to dig deeper and find the source of the NaNs, so this is just feedback.
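One way to narrow down where the NaNs first appear is to check intermediate outputs stage by stage. Below is a minimal, framework-agnostic sketch, assuming each stage's output can be flattened to a list of floats. The helper `first_non_finite` and the stage names are hypothetical and not part of VAN or any library.

```python
import math


def first_non_finite(named_values):
    """Return (name, index) of the first NaN/Inf value found, or None."""
    for name, values in named_values.items():
        for i, v in enumerate(values):
            if not math.isfinite(v):
                return name, i
    return None


# Hypothetical per-stage outputs, listed in forward-pass order.
activations = {
    "stage1": [0.5, -1.2],
    "stage2": [3.0, float("inf")],
    "head": [float("nan"), 0.0],
}

culprit = first_non_finite(activations)
```

Checking stages in forward order matters: an Inf in an early stage often turns into NaN further down, so the first non-finite value is usually closer to the real cause than the final NaN output.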