I have an RTX 2060 Super with 8 GB of memory and am running a 2-epoch, 75-step AdamW training run on 20 images.
I have noticed an issue when saving epochs during training: after the first epoch is saved to disk, training slows down and the speed varies. Although the shell displays 1 it/s, each iteration actually takes 7-8 seconds of real time. The first epoch took 10 minutes and the second took almost 4 hours. When I run 15 epochs of 10 steps on 20 images without saving epochs, each epoch takes 10 minutes. So the issue seems to occur after an epoch is saved to disk. Is this a known issue?