
Hi, very glad to see this new version of Swin-Trans. Could I ask a question about using mixed-precision training? #35

Open
SudongCAI opened this issue Apr 29, 2022 · 2 comments

Comments

@SudongCAI

Dear author,

I notice that the repo recommends using Apex mixed precision for fine-tuning.
How about training from scratch on ImageNet-1K (should I also enable Apex mixed-precision training in that case)?
Previously, I found that mixed precision could degrade results when training CNNs from scratch on ImageNet.
Hence, I wonder whether mixed-precision training was the default setting in the CSWin (or Swin) experiments.
Thank you so much!

@Andy1621

Here I can share some experience from my UniFormer; you can also follow our work to do it~

Mixed precision is a common trick for training Vision Transformers; in our experiments, it does not hurt performance. Both Apex mixed precision and PyTorch's native AMP work!
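For reference, here is a minimal sketch of one training step with PyTorch's native AMP (`torch.cuda.amp`); the `model`, `loader`, `optimizer`, and `criterion` names are placeholders, not from this repo:

```python
import torch

# Assumed placeholders: model, loader, optimizer, criterion.
scaler = torch.cuda.amp.GradScaler()

for images, targets in loader:
    images, targets = images.cuda(), targets.cuda()
    optimizer.zero_grad()
    # The forward pass runs in mixed precision under autocast.
    with torch.cuda.amp.autocast():
        outputs = model(images)
        loss = criterion(outputs, targets)
    # Scale the loss so fp16 gradients do not underflow,
    # then unscale, step the optimizer, and update the scaler.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```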
But sometimes mixed precision will cause the loss to go to NaN, and layer scale is another trick to handle that.
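In case it helps, a minimal layer-scale sketch (learnable per-channel scaling of the residual branch, as in CaiT); the block wiring shown in the comment is illustrative, not CSWin's actual code:

```python
import torch
import torch.nn as nn

class LayerScale(nn.Module):
    """Learnable per-channel scaling for a residual branch (CaiT-style)."""
    def __init__(self, dim, init_value=1e-5):
        super().__init__()
        # A small initial gamma keeps residual updates tiny early in
        # training, which helps avoid fp16 loss blow-ups (NaN loss).
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x):
        return self.gamma * x

# Illustrative use inside a Transformer block:
#   x = x + ls1(attn(norm1(x)))
#   x = x + ls2(mlp(norm2(x)))
```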

@SudongCAI
Author

Understood. Thanks so much for your kind reply!
