Questions about Training Time and Test Latency #22
Comments
Same here: on a single A100 GPU, training takes about 5 days.
Sorry for the delay. It takes us about 1 day to train YOLO-MS-XS (w/ SE attention) for 300 epochs. The slow training may be caused by a version mismatch between mmcv and PyTorch. We recommend mmcv==2.0.0rc4, pytorch==1.12.1, and cuda==11.6. Note that mmcv may need to be recompiled. Thanks for your interest in our work!
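A quick way to check for such a mismatch is to confirm which CUDA build PyTorch ships with and whether mmcv's compiled ops load against it. A minimal sketch, not an official script; the expected version numbers follow the authors' recommendation above:

    # env_check.py - minimal environment sanity check (a sketch, not from this repo).
    import torch
    import mmcv

    print("torch:", torch.__version__)             # expect 1.12.1
    print("built with CUDA:", torch.version.cuda)  # expect 11.6
    print("cuda available:", torch.cuda.is_available())
    print("mmcv:", mmcv.__version__)               # expect 2.0.0rc4

    # Importing a compiled op checks that mmcv's CUDA extensions were built for
    # the installed PyTorch; an ImportError or undefined-symbol error here
    # usually means mmcv must be recompiled (or reinstalled) for this
    # PyTorch/CUDA combination.
    from mmcv.ops import nms
    print("mmcv compiled ops load OK")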
I am actually using the latest versions. Could it be a batch size issue? I had to reduce the batch size to fit the GPU memory.
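If the per-GPU batch size has to be reduced, one clean way is a small derived config rather than editing the base one; the config name encodes the intended setting (8 GPUs x batch 8). A sketch assuming the usual mmyolo config layout; the file name and variable names are hypothetical and should be checked against the base config:

    # yoloms-xs-se_smaller-batch.py - hypothetical derived config for smaller GPUs.
    _base_ = "./yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py"

    # Halve the per-GPU batch to fit memory; note the effective batch size
    # (and therefore the learning-rate schedule and training time) changes too.
    train_batch_size_per_gpu = 4
    train_dataloader = dict(batch_size=train_batch_size_per_gpu)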
We have encountered this issue before, but we forget the details. We remember solving it by downgrading the PyTorch version.
Hi, I'm attempting to reproduce YOLO-MS-XS (with SE attn). Training from scratch (300 epochs) is estimated to take almost 10 days on 8x RTX 3090 GPUs. The training command,
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash tools/dist_train.sh configs/yoloms/yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py 8
, aligns with the README file. Is it common for training to take this long? Could you share your training time? Also, which GPU device are you using to test the model's latency?
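For the latency question, a rough way to time inference on a single GPU is to warm up first and then average over many runs with CUDA events. A sketch only, not the authors' benchmark: the checkpoint and image paths are placeholders, the config path is taken from the command above, and depending on the mmyolo/mmdet versions you may need to register the project's modules before building the model.

    # latency_check.py - rough single-image latency measurement (a sketch).
    import torch
    from mmdet.apis import init_detector, inference_detector

    config = "configs/yoloms/yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py"
    checkpoint = "path/to/yoloms-xs-se.pth"   # placeholder
    image = "path/to/test.jpg"                # placeholder
    model = init_detector(config, checkpoint, device="cuda:0")

    # Warm-up so CUDA context creation and cuDNN autotuning do not skew the timing.
    for _ in range(10):
        inference_detector(model, image)

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    runs = 100
    for _ in range(runs):
        inference_detector(model, image)
    end.record()
    torch.cuda.synchronize()
    # Includes the data pipeline, so this is an upper bound on pure model latency.
    print(f"avg latency: {start.elapsed_time(end) / runs:.2f} ms")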