I was excited to give this one a try based on the research paper. I'm testing a full checkpoint train in FLUX and was hoping to take advantage of this optimizer's quicker convergence. However, I can't get it to operate within the constraints of 24 GB VRAM (RTX 4090). If I enable any of the block swap options, it errors out as not available; the same happens with the paged version. Does anyone have a config that gets this working for checkpoint training under 24 GB VRAM? Am I missing an option? Is it supposed to work with block swap? I'm not filing this as a bug report because it could easily be my own mistake; the people maintaining this are 10x smarter than I am about these things.
I have not tried it with LoRA training, but maybe it works better there? I welcome any suggestions. I'm training with Adafactor for now, but convergence is slow even at a learning rate higher than I really want to be using.
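For context, here is a rough sketch of the kind of Adafactor fallback setup I mean. I'm assuming kohya-style sd-scripts flags here (`flux_train.py`, `--optimizer_type`, `--optimizer_args`, `--blocks_to_swap`); the paths and values are placeholders rather than my exact config, so treat it as illustrative only:

```bash
# Hypothetical sketch: FLUX full fine-tune falling back to Adafactor,
# with block swapping to stay under 24 GB VRAM. Flags assume kohya-style
# sd-scripts (flux branch); all paths and values are placeholders.
accelerate launch flux_train.py \
  --pretrained_model_name_or_path /models/flux1-dev.safetensors \
  --clip_l /models/clip_l.safetensors \
  --t5xxl /models/t5xxl_fp16.safetensors \
  --ae /models/ae.safetensors \
  --dataset_config dataset.toml \
  --output_dir output --output_name flux-ft \
  --optimizer_type adafactor \
  --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" \
  --learning_rate 1e-5 \
  --blocks_to_swap 18 \
  --mixed_precision bf16 --save_precision bf16 \
  --gradient_checkpointing
```

The question stands: is there an equivalent arrangement (block swap or paged states) that lets the new optimizer fit in 24 GB, or is it simply not supported yet?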