LR Scheduler and Optimizer #11
lukas-blecher started this conversation in Ideas
Replies: 2 comments 4 replies
I am currently training as well. I got BLEU=0.89 once, but then it fluctuated between 0.7 and 0.8x the whole time. I guess it has something to do with the LR, as you mentioned. I am planning to make it faster for deployment if possible.
Ideas for speeding up training by choosing optimal optimization algorithms.
As optimizers I have only ever tried Adam and AdamW. Both seem to perform quite well, but I had the feeling that AdamW is a better fit.
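For concreteness, a minimal sketch of what I mean by the AdamW setup (the model, learning rate, and weight decay here are placeholders, not tuned values from my runs):

```python
import torch

# Placeholder model; in practice this is whatever nn.Module is being trained.
model = torch.nn.Linear(10, 10)

# AdamW decouples weight decay from the adaptive gradient update, which is the
# main difference from Adam with the same weight_decay setting.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-2)
```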
Now for the LR Scheduler.
It has quite a big effect on training progress. I've mostly used OneCycleLR so far, but the loss either stagnated or even got worse after some time. That's why I continued training after a couple of epochs with a "fresh" OneCycle.
Maybe using a cyclic scheduler from the start would be the way to go, something like CosineAnnealingLR.
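A rough sketch of the "cyclic from the start" idea, using CosineAnnealingWarmRestarts (the warm-restart variant of CosineAnnealingLR) instead of manually re-creating a fresh OneCycleLR; the model, learning rate, steps_per_epoch, epochs, and cycle length are placeholder assumptions:

```python
import torch

# Placeholder model and hyperparameters, not values from my runs.
model = torch.nn.Linear(10, 10)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=1e-2)
steps_per_epoch, epochs = 1000, 30

# Cyclic from the start: cosine annealing with warm restarts. The LR is reset
# every T_0 steps and each cycle is twice as long as the previous one, instead
# of manually restarting a fresh OneCycleLR whenever the loss stagnates.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(
    optimizer, T_0=5 * steps_per_epoch, T_mult=2, eta_min=1e-6)

for step in range(epochs * steps_per_epoch):
    # ... forward pass, loss.backward(), optimizer.step() ...
    scheduler.step()  # stepped once per batch, so T_0 is measured in steps
```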
Does anybody have experience with other schedulers/optimizers?