Hello, thank you for this wonderful work!
I have a few questions and hope you can help:
I only have one RTX 3090 GPU with 24GB of memory, so in the RGT training configuration file the settings
num_worker_per_gpu: 16
batch_size_per_gpu: 8
need to be reduced. I plan to try training with:
num_worker_per_gpu: 6
batch_size_per_gpu: 4
But will this affect the evaluation metrics after training? Should I adjust other training parameters to compensate, such as the learning rate or the number of training iterations?
Looking forward to your reply~
Yes, it will affect the training. Model training is based on steps (iterations) rather than epochs. Therefore, changes in batches and GPUs require adjustments to the iterations.
With batch_size_per_gpu=4 on 1 GPU, compared to the original setup of batch_size_per_gpu=8 on 4 GPUs, the amount of data seen per iteration decreases by a factor of 8. You would need to multiply the number of iterations by 8 and scale the learning-rate schedule accordingly. This may not reproduce the results exactly, but the difference should not be significant.
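The scaling rule above can be sketched as a small helper. This is a hedged illustration: the key names (`total_iter`, `milestones`) follow the BasicSR-style configs that RGT uses, and the example numbers are placeholders, not the actual RGT schedule.

```python
def scale_schedule(total_iter, milestones, orig_gpus, orig_bs, new_gpus, new_bs):
    """Scale iteration counts so the total number of training samples
    seen stays constant when the effective batch size changes."""
    factor = (orig_gpus * orig_bs) / (new_gpus * new_bs)
    scaled_iters = int(total_iter * factor)
    scaled_milestones = [int(m * factor) for m in milestones]
    return scaled_iters, scaled_milestones

# Original setup: 4 GPUs x batch 8 = effective batch 32.
# New setup: 1 GPU x batch 4 = effective batch 4, so factor = 8.
new_total, new_milestones = scale_schedule(
    total_iter=500_000,                # illustrative value
    milestones=[250_000, 400_000],     # illustrative LR-decay steps
    orig_gpus=4, orig_bs=8,
    new_gpus=1, new_bs=4,
)
# new_total == 4_000_000; new_milestones == [2_000_000, 3_200_000]
```

You would then write the scaled values for `total_iter` and the scheduler `milestones` back into the training YAML before launching.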
PS: Given your computing resources, you might consider using DAT-light instead.