Long Pauses Between Epochs #36

Closed
official-elinas opened this issue Jan 10, 2023 · 4 comments

Comments


official-elinas commented Jan 10, 2023

I noticed that there are long pauses between epochs, around 35-40 seconds. What is the reasoning behind this and is there a way to lower it significantly or disable it entirely?

I looked through the code and was not able to find anything specific to pausing between epochs. The other Dreambooth extension, the one for Automatic1111's web UI, has an option to pause between epochs (default 60s), which can be lowered to 1s if desired.

This also applies to native training.

Edit: Looks like it hangs on line 206 in train_db.py every epoch.

Thanks.
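
One way to narrow down where the pause comes from is to time how long each epoch waits before its first batch arrives. The sketch below is only an illustration under assumptions: `make_loader` and `train_step` are hypothetical placeholders for whatever builds the data iterator and runs one step, and they are not names taken from train_db.py.

```python
import time
from typing import Callable, Iterable, List


def time_epoch_gaps(
    make_loader: Callable[[], Iterable],
    train_step: Callable[[object], None],
    num_epochs: int,
) -> List[float]:
    """Return, for each epoch, the seconds spent waiting for the first batch."""
    startup_delays = []
    for _ in range(num_epochs):
        t0 = time.perf_counter()
        batches = iter(make_loader())  # per-epoch setup cost can land here...
        first = next(batches, None)    # ...or here, while the first batch is produced
        startup_delays.append(time.perf_counter() - t0)
        if first is None:
            continue  # empty loader: nothing to train on this epoch
        train_step(first)
        for batch in batches:
            train_step(batch)
    return startup_delays


if __name__ == "__main__":
    def slow_loader():
        # Hypothetical stand-in for work done at the epoch boundary.
        time.sleep(0.2)
        yield from range(3)

    delays = time_epoch_gaps(slow_loader, lambda batch: None, num_epochs=2)
    print([round(d, 2) for d in delays])
```

If the measured delay is close to the 35-40 seconds seen between epochs, the time is going into starting iteration over the data rather than into the training steps themselves.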

@bmaltais
Owner

This is related to the latest accelerate version... I can't do much about it until kohya upgrades accelerate to another version as part of his supported code base... but I'm glad you raised it so others are not taken aback by it ;-)

Author

official-elinas commented Jan 24, 2023

So I'm a bit confused. The latest version of accelerate is 0.15.0 (https://pypi.org/project/accelerate/). What do you mean by upgrading to another version when this is the latest version? I feel this might not be an accelerate issue.
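
The version installed in a given training environment can differ from the latest release on PyPI, so it may be worth checking what is actually in use. A minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name, not part of accelerate or of this repository:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints None if accelerate is not installed in this environment.
    print("accelerate:", installed_version("accelerate"))
```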


DarkAlchy commented Jan 30, 2023

kohya-ss/sd-scripts#125

It uses the CPU far too much while memory usage slowly ramps up, then it dumps to CUDA, then comes the epoch, where the cycle starts over. Considering that Automatic1111 works fine and uses the GPU (1 GB more of it), I can say it is something in Kohya's code.

Author

official-elinas commented Feb 2, 2023

I definitely agree. I also do not want to use repeats, since epochs are a better measure of progress and repeats produce different results than epochs alone. Using repeats almost seems like a workaround for whatever is wrong with the code.

3 participants