I was excited to give this one a try based on the research paper. I'm testing a full checkpoint train in FLUX and was hoping to take advantage of this optimizer's quicker convergence. However, I can't get it to operate within the constraints of 24 GB VRAM (RTX 4090). If I enable any of the block swap options, it errors out as not available; the same happens with the paged version. Does anyone have a config that gets this working for checkpoint training under 24 GB VRAM? Am I missing an option? Is it supposed to work with block swap? I'm not filing this as a bug report because it could easily be my own mistake; the people maintaining this are 10x smarter than I am about these things.
I have not tried it with LoRA training, but maybe it works better there? I welcome any suggestions. I'm training with Adafactor for now, but convergence is slow even at a learning rate higher than I really want to be using.
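For context, here is a rough sketch of the kind of Adafactor fallback setup I mean. I'm assuming kohya-style sd-scripts flags here (`flux_train.py`, `--optimizer_type`, `--optimizer_args`, `--blocks_to_swap`); the paths and values are placeholders rather than my exact config, so treat it as illustrative only:

```bash
# Hypothetical sketch: FLUX full fine-tune falling back to Adafactor,
# with block swapping to stay under 24 GB VRAM. Flags assume kohya-style
# sd-scripts (flux branch); all paths and values are placeholders.
accelerate launch flux_train.py \
  --pretrained_model_name_or_path /models/flux1-dev.safetensors \
  --clip_l /models/clip_l.safetensors \
  --t5xxl /models/t5xxl_fp16.safetensors \
  --ae /models/ae.safetensors \
  --dataset_config dataset.toml \
  --output_dir output --output_name flux-ft \
  --optimizer_type adafactor \
  --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" \
  --learning_rate 1e-5 \
  --blocks_to_swap 18 \
  --mixed_precision bf16 --save_precision bf16 \
  --gradient_checkpointing
```

The question stands: is there an equivalent arrangement (block swap or paged states) that lets the new optimizer fit in 24 GB, or is it simply not supported yet?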