Diffusers
SD.Next includes experimental support for additional model pipelines, which adds support for models such as:
- Stable Diffusion XL
- Kandinsky
- Deep Floyd IF
And soon:
- Shap-E, UniDiffuser, Consistency Models, Diffedit Zero-Shot
- Text2Video, Video2Video, etc...
Note that support is experimental: do not open GitHub issues for these models; instead, reach out on Discord using the dedicated #diffusers channels
This has been made possible by integrating the Huggingface diffusers library, with the help of the Huggingface team!
Initial support has been merged into the dev branch
- Download from branch:
  git clone https://github.com/vladmandic/automatic -b dev diffusers
  cd diffusers
- Install SD.Next
- Start with:
  webui --backend diffusers
- To go back to the standard execution pipeline, start with:
  webui --backend original
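When switching backends from a script, the start commands above can be assembled programmatically. The following is a minimal Python sketch and not part of SD.Next: the helper name `build_launch_command` is hypothetical, and it assumes the `webui` launcher is invoked exactly as written above with one of the two backend names this page mentions.

```python
# Hypothetical helper: builds the launch command shown above.
# Only the two backend names mentioned on this page are accepted.
KNOWN_BACKENDS = ("original", "diffusers")


def build_launch_command(backend: str, launcher: str = "webui") -> list[str]:
    """Return the argument list for starting SD.Next with a given backend."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend {backend!r}, expected one of {KNOWN_BACKENDS}")
    return [launcher, "--backend", backend]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_launch_command("diffusers")))
```

The returned list can be passed to `subprocess.run` as is; rejecting unknown names up front avoids starting the application with a misspelled backend.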
- txt2img
- img2img
- process
- For standard SD 1.5 and SD 2.1 models, you can use either standard safetensors models (single file) or diffusers models (folder structure)
- For additional models, you can use diffusers models only
- You can download diffusers models directly from the Huggingface hub or use the built-in model search & download in SD.Next: UI -> Models -> Huggingface
- Note that access to some models is gated, in which case you need to accept the model EULA and provide your Huggingface token
- When loading safetensors models, you must specify the model pipeline type in: UI -> Settings -> Diffusers -> Pipeline
  When loading Huggingface models, the pipeline type is detected automatically
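The difference between the two model layouts described above can be illustrated with a small Python sketch. It is an assumption-laden illustration, not how SD.Next detects models: the function names are hypothetical, and the check relies on the fact that the diffusers library stores a `model_index.json` file at the root of a saved pipeline folder.

```python
from pathlib import Path


def detect_model_format(path: str) -> str:
    """Classify a model path as 'safetensors', 'diffusers', or 'unknown'."""
    p = Path(path)
    # Single-file checkpoint: one file with the .safetensors extension.
    if p.is_file() and p.suffix == ".safetensors":
        return "safetensors"
    # Diffusers layout: a folder with model_index.json at its root.
    if p.is_dir() and (p / "model_index.json").is_file():
        return "diffusers"
    return "unknown"


def needs_explicit_pipeline(path: str) -> bool:
    """Single-file models carry no pipeline description, so the type must be set by hand."""
    return detect_model_format(path) == "safetensors"
```

A folder-based model describes its own pipeline in `model_index.json`, which is why its pipeline type can be detected automatically while a single safetensors file needs the setting.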
- Lora networks
- Textual inversions (embeddings)
Note that Lora and TI are still model-specific, so you cannot use a Lora trained on SD 1.5 with SD-XL
(just as you couldn't use it with an SD 2.1 model): it needs to be trained for a specific model
Support for SD-XL training is expected shortly
- UI -> Settings -> Diffuser Settings contains additional tunable parameters
- Samplers (schedulers) are pipeline specific, so when running with the diffusers backend, you'll see a different list of samplers
- UI -> Settings -> Sampler Settings shows different configurable parameters depending on backend
- Recommended sampler for diffusers is DEIS
- Updated System Info tab with additional information
- Support for lowvram and medvram modes
  Additional tunables are available in UI -> Settings -> Diffuser Settings
- Support for both default SDP and xFormers cross-optimizations
  Other cross-optimization methods are not available
- Extra Networks UI will show available diffusers models
- CUDA model compile
  UI Settings -> Compute settings
  Requires GPU with high VRAM
  Diffusers recommend reduce overhead compile mode, but other methods are available as well
  Fullgraph compile is possible (with sufficient VRAM) when using diffusers
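For readers who want to see what the compile setting corresponds to outside the UI, here is a hedged Python sketch. It assumes PyTorch 2.x, where the mode is spelled `reduce-overhead`, and a diffusers pipeline object exposing a `unet` attribute; the helper names are hypothetical and this is not how SD.Next wires its own setting.

```python
def compile_kwargs(fullgraph: bool = False) -> dict:
    """Keyword arguments for torch.compile matching the recommendation above."""
    return {"mode": "reduce-overhead", "fullgraph": fullgraph}


def compile_unet(pipe, fullgraph: bool = False):
    """Compile the UNet of an already loaded diffusers pipeline.

    Assumes `pipe` has a `unet` attribute; fullgraph=True is stricter
    and, per the note above, needs sufficient VRAM.
    """
    import torch  # imported lazily so compile_kwargs works without PyTorch

    pipe.unet = torch.compile(pipe.unet, **compile_kwargs(fullgraph))
    return pipe
```

The first generation after compiling is typically much slower because compilation happens lazily on the first call; the speedup only shows on subsequent runs.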
- SD-XL Technical Report
- SD-XL model is designed as a two-stage model
  You can run the SD-XL pipeline using just the base model or load both base and refiner models
  - base: Trained on images with a variety of aspect ratios and uses OpenCLIP-ViT/G and CLIP-ViT/L for text encoding
  - refiner: Trained to denoise small noise levels of high quality data and uses the OpenCLIP model
  - Having both base model and refiner model loaded can require significant VRAM
  - If you want to use the refiner model, it is advised to add sd_model_refiner to quicksettings in UI Settings -> User Interface
- SD-XL model was trained on 1024px images
  You can use it with smaller sizes, but you will likely get better results with SD 1.5 models
- SD-XL model NSFW filter has been turned off
- Go to UI -> Models -> Huggingface
- Enter stabilityai/stable-diffusion-xl-base-0.9 in Select Model and press Download
- Enter stabilityai/stable-diffusion-xl-refiner-0.9 in Select Model and press Download
Do not attempt to use the safetensors version of SD-XL until full support is added (soon)
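As an alternative illustration of the download steps, the same two repositories could in principle be fetched with the `huggingface_hub` library. This is a sketch under assumptions: it treats both repositories as gated, reads the token from an `HF_TOKEN` environment variable purely as a convention of this example, uses hypothetical helper names, and downloads into the local Huggingface cache rather than into wherever SD.Next keeps its models.

```python
import os

SDXL_REPOS = (
    "stabilityai/stable-diffusion-xl-base-0.9",
    "stabilityai/stable-diffusion-xl-refiner-0.9",
)


def resolve_token(explicit=None):
    """Prefer an explicit token, else fall back to the HF_TOKEN environment variable."""
    return explicit or os.environ.get("HF_TOKEN")


def download_sdxl(token=None):
    """Download both repositories and return their local cache paths."""
    from huggingface_hub import snapshot_download  # lazy: needs huggingface_hub installed

    token = resolve_token(token)
    if token is None:
        # Assumption of this sketch: the repositories are gated.
        raise RuntimeError("accept the model EULA and provide a Huggingface token")
    return [snapshot_download(repo_id=repo, token=token) for repo in SDXL_REPOS]
```

The built-in downloader in the UI remains the simpler route; the sketch only shows where the token and the repository names enter the process.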
- Skip/Stop operations are not possible while running a diffusers model
- Any extension that requires access to model internals will likely not work when using diffusers backend
  This for example includes standard extensions such as ControlNet, MultiDiffusion, LyCORIS
  Note: application will auto-disable incompatible built-in extensions when running in diffusers mode
  If you go back to original mode, you will need to re-enable extensions
- Explicit vae usage is not yet implemented
- Explicit refiner as postprocessing is not yet implemented
- Second-pass workflows such as hires fix are not yet implemented
- Hypernetworks
- Limited callbacks support for scripts/extensions: additional callbacks will be added as needed
Comparison of the original stable diffusion pipeline and the diffusers pipeline when using a standard SD 1.5 model
Performance is measured for batch sizes 1, 2, 4, 8, 16
pipeline | performance it/s | memory cpu/gpu |
---|---|---|
original | 7.99 / 7.93 / 8.83 / 9.14 / 9.2 | 6.7 / 7.2 |
original medvram | 6.23 / 7.16 / 8.41 / 9.24 / 9.68 | 8.4 / 6.8 |
original lowvram | 1.05 / 1.94 / 3.2 / 4.81 / 6.46 | 8.8 / 5.2 |
diffusers | 9 / 7.4 / 8.2 / 8.4 / 7.0 | 4.3 / 9.0 |
diffusers medvram | 7.5 / 6.7 / 7.5 / 7.8 / 7.2 | 6.6 / 8.2 |
diffusers lowvram | 7.0 / 7.0 / 7.4 / 7.7 / 7.8 | 4.3 / 7.2 |
diffusers with safetensors | 8.9 / 7.3 / 8.1 / 8.4 / 7.1 | 5.9 / 9.0 |
Notes:
- Test environment: nVidia RTX 3060 GPU, Torch 2.1-nightly with CUDA 12.1, Cross-optimization: SDP
- All else being equal, diffusers seem to:
  - Use slightly less RAM and more VRAM
  - Have highly efficient medvram/lowvram equivalents which don't lose a lot of performance
  - Be faster on smaller batch sizes and slower on larger batch sizes
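To make the batch-size comparison in the notes concrete, the it/s figures from the table can be turned into ratios. The Python sketch below copies four rows from the table; the dictionary and function names are illustrative only.

```python
BATCH_SIZES = (1, 2, 4, 8, 16)

# it/s values copied from the table above, one entry per batch size
ITS = {
    "original": (7.99, 7.93, 8.83, 9.14, 9.2),
    "original lowvram": (1.05, 1.94, 3.2, 4.81, 6.46),
    "diffusers": (9, 7.4, 8.2, 8.4, 7.0),
    "diffusers lowvram": (7.0, 7.0, 7.4, 7.7, 7.8),
}


def speed_ratio(candidate: str, baseline: str) -> dict:
    """Map batch size -> candidate it/s divided by baseline it/s."""
    return {
        batch: cand / base
        for batch, cand, base in zip(BATCH_SIZES, ITS[candidate], ITS[baseline])
    }


if __name__ == "__main__":
    for batch, ratio in speed_ratio("diffusers", "original").items():
        print(f"batch {batch:>2}: {ratio:.2f}x")
```

With these numbers the plain diffusers pipeline has a ratio above 1 only at batch size 1 and falls to about 0.76 at batch size 16, while `speed_ratio("diffusers lowvram", "original lowvram")` stays above 1 for every batch size.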