New Release #3398
vladmandic announced in Announcements
Update for 2024-08-31
Highlights for 2024-08-31
Summer break is over and we are back with a massive update!
Support for all of the new models: FLUX.1, AuraFlow v0.3, Lumina-Next-SFT, Kolors and Hunyuan-DiT
What else? Just a bit... ;)
New fast-install mode, new Optimum Quanto and BitsAndBytes based quantization modes, new balanced offload mode that dynamically offloads GPU<->CPU as needed, and more...
And from previous service-pack: new ControlNet-Union all-in-one model, support for DoRA networks, additional VLM models, new AuraSR upscaler
Breaking Changes...
Due to internal changes, you'll need to reset your attention and offload settings!
But for good reason: the new balanced offload works magic for memory utilization while sacrificing minimal performance!
Details for 2024-08-31
New Models...
To use any of the new models, simply select the model from Networks -> Reference and it will be auto-downloaded on first use
FLUX.1 models are based on a hybrid architecture of multimodal and parallel diffusion transformer blocks, scaled to 12B parameters and building on flow matching
This is a very large model at ~32GB in size; it's recommended to use (a) offloading and (b) quantization
For more information on variations, requirements, options, and how to download and use FLUX.1, see the Wiki
SD.Next supports:
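The ~32GB figure quoted above is roughly what the full-precision weights add up to; a back-of-envelope check (parameter counts beyond the 12B transformer are approximate, not taken from the release notes):

```python
GB = 1024 ** 3
bytes_per_param = 2  # bf16 storage

transformer = 12e9  # FLUX.1 transformer, ~12B params (per the notes above)
t5_xxl = 4.7e9      # T5-XXL text encoder, approximate param count

total_gb = (transformer + t5_xxl) * bytes_per_param / GB
print(round(total_gb))  # ~31 GB before the CLIP encoder and VAE
```

This is why offloading and/or quantization are effectively mandatory on consumer GPUs.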
AuraFlow v0.3 is the largest fully open-sourced flow-based text-to-image generation model
This is a very large model at 6.8B params and nearly 31GB in size; smaller variants are expected in the future
Use scheduler: Default or Euler FlowMatch or Heun FlowMatch
Lumina-Next-SFT is a Next-DiT model containing 2B parameters, enhanced through high-quality supervised fine-tuning (SFT)
This model uses T5 XXL variation of text encoder (previous version of Lumina used Gemma 2B as text encoder)
Use scheduler: Default or Euler FlowMatch or Heun FlowMatch
Kolors is a large-scale text-to-image generation model based on latent diffusion
This is an SDXL-style model that replaces the standard CLIP-L and CLIP-G text encoders with a massive chatglm3-6b encoder supporting both English and Chinese prompting
Hunyuan-DiT is a powerful multi-resolution diffusion transformer (DiT) with fine-grained Chinese understanding
support for additional models: SD 1.5 v3 (Sparse), SD Lightning (4-step), SDXL Beta
New Features...
balanced offload will dynamically split and offload models from the GPU based on the max configured GPU and CPU memory size
model parts that don't fit in the GPU will be dynamically sliced and offloaded to the CPU
see Settings -> Diffusers Settings -> Max GPU memory and Max CPU memory
note: recommended value for max GPU memory is ~80% of your total GPU memory
note: balanced offload will force loading LoRA with Diffusers method
note: balanced offload is not compatible with Optimum Quanto
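The budgeting idea behind balanced offload can be sketched in a few lines. This is a toy illustration only, not SD.Next's implementation (which moves model parts on demand during the forward pass); the block names and sizes are made up:

```python
def balanced_offload(blocks: dict, max_gpu_gb: float) -> dict:
    """Greedy sketch: keep blocks on the GPU until the configured
    budget is exhausted, then spill the remainder to the CPU."""
    placement, used = {}, 0.0
    for name, size_gb in blocks.items():
        if used + size_gb <= max_gpu_gb:
            placement[name] = "gpu"
            used += size_gb
        else:
            placement[name] = "cpu"
    return placement

# hypothetical block sizes in GB for illustration
blocks = {"unet": 5.0, "text_encoder": 1.5, "vae": 0.3}
print(balanced_offload(blocks, max_gpu_gb=6.0))
```

The ~80% recommendation above leaves headroom for activations and intermediate tensors, which the weight budget alone does not account for.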
to use, go to Settings -> Compute Settings and enable "Quantize Model weights with Optimum Quanto" option
note: Optimum Quanto requires PyTorch 2.4
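For intuition, the general idea behind int8 quantization modes such as Quanto's qint8 is symmetric rescaling of weights into the int8 range; a minimal sketch (not the library's actual code):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: store 1 byte per
    weight instead of 2 (bf16), roughly halving memory."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights at inference time."""
    return [v * scale for v in q]

q, s = quantize_int8([0.5, -1.27, 0.02])  # q is [50, -127, 2]
```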
to use, enable in Settings -> Execution -> Prompt attention
this improves LoRA compatibility for SC, SD3, AuraFlow, Flux, etc.
Changes & Fixes...
note: requires reset of selected attention option
requirements
note: requires reset of selected offload option
for example: set size->before->method:nearest, mode:fixed or mode:fill
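For reference, method:nearest above refers to nearest-neighbor resampling; a self-contained sketch on a simple row-major pixel grid (the function name is illustrative, not SD.Next's code):

```python
def resize_nearest(pixels, new_w, new_h):
    """Nearest-neighbor resize: each output pixel copies the closest
    source pixel, with no interpolation."""
    h, w = len(pixels), len(pixels[0])
    return [[pixels[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

img = [[1, 2],
       [3, 4]]
print(resize_nearest(img, 4, 4))  # each source pixel becomes a 2x2 block
```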
--use-* option is used, thanks @lshqqytiger