Question about the training parameters #1

Open

zjcs opened this issue Nov 21, 2020 · 4 comments

Comments

zjcs commented Nov 21, 2020

Hello @feymanpriv, thank you very much for your work.

I am reviewing your code, and I would like to ask about some key training parameters in your work:

As claimed in the DELG paper, only 15M steps (about 25 epochs) are needed on the GLDv2-clean train (80%) split, while the max epoch in your config file is 100. Did you finally train for 100 epochs to get your result?

Are there any modifications in your implementation that differ from the original TensorFlow implementation?

@feymanpriv (Owner)

Yes, we trained for 100 epochs. In fact, I have tried the TF implementation; it did not work well, the batch size could only be set to a small value, and the training speed was slow. So I guess some details are different from the author's. We used a cosine LR schedule in the experiment.
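
For reference, a cosine schedule like the one mentioned can be set up with PyTorch's built-in scheduler. This is only a minimal sketch; the model, class count, and hyperparameter values below are placeholders, not the repo's actual code:

```python
from torch import nn, optim

# Placeholder model standing in for the real DELG backbone/head.
model = nn.Linear(2048, 81313)  # e.g. global descriptor -> GLDv2-clean classes (assumed count)

# SGD with a placeholder initial LR; CosineAnnealingLR decays it smoothly
# toward 0 over MAX_EPOCH epochs (100 in the config discussed here).
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    # ... run one training epoch here ...
    scheduler.step()  # step the schedule once per epoch
```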

zjcs (Author) commented Nov 23, 2020

@feymanpriv, thank you for your reply.

Your code is very clear to read; thanks for your work. I still have some questions:

  1. About the learning rate and batch size: as DELG recommends, the hyperparameters are 8 Tesla P100 GPUs with --batch_size=256 and --initial_lr=0.01, or 4 Tesla P100 GPUs with --batch_size=128 and --initial_lr=0.005, while your config file uses 8 GPUs with --batch_size=64 and --initial_lr=0.01; given the implementation in loader.construct_train_loader, shouldn't --batch_size be 256? (See the sketch after this list.)

  2. Can you share more results on the difference between training for 25 epochs and for 100 epochs?

  3. You said the TF implementation does not work well; did you use the recommended parameters, and what was the result after 25 epochs?

  4. What do you mean by "the batch size could only be set to a small value"?
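
On point 1, for context: pycls-style loaders typically treat the configured batch size as the global batch and split it evenly across GPUs, and the base LR is usually scaled linearly with the global batch. A minimal sketch of that convention, assuming hypothetical names and the usual linear-scaling rule (check loader.construct_train_loader for the actual logic):

```python
# Assumed convention: the config batch size is the global batch, split per GPU.
total_batch_size = 256   # hypothetical config value (e.g. TRAIN.BATCH_SIZE)
num_gpus = 8             # hypothetical config value (e.g. NUM_GPUS)

assert total_batch_size % num_gpus == 0, "global batch must divide evenly across GPUs"
per_gpu_batch_size = total_batch_size // num_gpus  # 32 images per GPU here

# Linear scaling rule: scale a reference LR with the global batch size.
base_lr, base_batch = 0.01, 256
scaled_lr = base_lr * total_batch_size / base_batch

print(per_gpu_batch_size, scaled_lr)  # -> 32 0.01
```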

@feymanpriv (Owner)

@zjcs, thank you for your attention.

  1. I directly used batch size 256 and LR 0.1 in my training.

  2. I haven't tried 25 epochs.

  3. When I used TF, I also trained for 100 epochs with the settings the paper mentioned, but the result could not reach the model that the Google team released.

  4. Also, in TF training, a large batch size runs out of memory, while PyTorch does not (see the sketch below for a common workaround).

There still exist some open questions here; it would be helpful if you check the code and training.
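
On point 4: a common way to emulate a large effective batch when memory is the limit is gradient accumulation. This is a generic PyTorch sketch, not code from this repo; the model and shapes are placeholders:

```python
import torch
from torch import nn, optim

# Placeholder model and loss; the real training code lives in the repo.
model = nn.Linear(2048, 1000)
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

accum_steps = 8                       # 8 micro-batches of 32 -> effective batch of 256
optimizer.zero_grad()
for _ in range(accum_steps):
    x = torch.randn(32, 2048)         # placeholder micro-batch of features
    y = torch.randint(0, 1000, (32,)) # placeholder labels
    loss = criterion(model(x), y) / accum_steps  # average across micro-batches
    loss.backward()                   # gradients accumulate in .grad
optimizer.step()                      # single update for the whole effective batch
```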

zjcs (Author) commented Nov 24, 2020

@feymanpriv

Your reply is very helpful to me; thank you very much. I will try it later.

Have a nice day~

feymanpriv reopened this Dec 6, 2020