Questions about Training Time and Test Latency #22
Comments
Same here: on a single A100 GPU, training takes about 5 days.
Sorry for the delay. It takes us about 1 day to train YOLO-MS-XS (w/ SE attention) for 300 epochs. The slow training may be caused by a version mismatch between mmcv and PyTorch. We recommend mmcv==2.0.0rc4, pytorch==1.12.1, and cuda==11.6. Note that mmcv may need to be recompiled. Thanks for your interest in our work!
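A quick way to check for such a mismatch is to confirm which CUDA build PyTorch ships with and whether mmcv's compiled ops load against it. A minimal sketch, not an official script; the expected version numbers follow the authors' recommendation above:

    # env_check.py - minimal environment sanity check (a sketch, not from this repo).
    import torch
    import mmcv

    print("torch:", torch.__version__)             # expect 1.12.1
    print("built with CUDA:", torch.version.cuda)  # expect 11.6
    print("cuda available:", torch.cuda.is_available())
    print("mmcv:", mmcv.__version__)               # expect 2.0.0rc4

    # Importing a compiled op checks that mmcv's CUDA extensions were built for
    # the installed PyTorch; an ImportError or undefined-symbol error here
    # usually means mmcv must be recompiled (or reinstalled) for this
    # PyTorch/CUDA combination.
    from mmcv.ops import nms
    print("mmcv compiled ops load OK")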
I am actually using the latest versions. Could it be a batch size issue? I had to reduce the batch size to fit the GPU memory.
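If the per-GPU batch size has to be reduced, one clean way is a small derived config rather than editing the base one; the config name encodes the intended setting (8 GPUs x batch 8). A sketch assuming the usual mmyolo config layout; the file name and variable names are hypothetical and should be checked against the base config:

    # yoloms-xs-se_smaller-batch.py - hypothetical derived config for smaller GPUs.
    _base_ = "./yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py"

    # Halve the per-GPU batch to fit memory; note the effective batch size
    # (and therefore the learning-rate schedule and training time) changes too.
    train_batch_size_per_gpu = 4
    train_dataloader = dict(batch_size=train_batch_size_per_gpu)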
We have encountered this issue before, but we forget the details. We remember solving it by downgrading the PyTorch version.
Hi, I'm attempting to reproduce YOLO-MS-XS (with SE attn). Training from scratch (300 epochs) is estimated to take almost 10 days on 8x RTX 3090 GPUs. The training command,
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash tools/dist_train.sh configs/yoloms/yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py 8
, aligns with the README file. Is it common for training to take this long? Could you share your training time? Also, which GPU device are you using to test the model's latency?
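For the latency question, a rough way to time inference on a single GPU is to warm up first and then average over many runs with CUDA events. A sketch only, not the authors' benchmark: the checkpoint and image paths are placeholders, the config path is taken from the command above, and depending on the mmyolo/mmdet versions you may need to register the project's modules before building the model.

    # latency_check.py - rough single-image latency measurement (a sketch).
    import torch
    from mmdet.apis import init_detector, inference_detector

    config = "configs/yoloms/yoloms-xs-se_syncbn_fast_8xb8-300e_coco.py"
    checkpoint = "path/to/yoloms-xs-se.pth"   # placeholder
    image = "path/to/test.jpg"                # placeholder
    model = init_detector(config, checkpoint, device="cuda:0")

    # Warm-up so CUDA context creation and cuDNN autotuning do not skew the timing.
    for _ in range(10):
        inference_detector(model, image)

    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    runs = 100
    for _ in range(runs):
        inference_detector(model, image)
    end.record()
    torch.cuda.synchronize()
    # Includes the data pipeline, so this is an upper bound on pure model latency.
    print(f"avg latency: {start.elapsed_time(end) / runs:.2f} ms")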