dev merge #3388

Merged · 166 commits · Aug 31, 2024
Commits
112b1d8
preenable kolors
vladmandic Jul 10, 2024
f3d4000
update changelog
vladmandic Jul 10, 2024
a696775
Add ROCm 6.1.2 support. (ZLUDA)
lshqqytiger Jul 10, 2024
affdb48
Revert "Add ROCm 6.1.2 support. (ZLUDA)"
lshqqytiger Jul 10, 2024
9e6fb52
add auraflow
vladmandic Jul 12, 2024
a2d0f61
fix lint
vladmandic Jul 12, 2024
2e995fb
update changelog
vladmandic Jul 12, 2024
72bd998
fallback to pip if uv failed
Yoinky3000 Jul 12, 2024
f151cdc
only fallback if return code isnt 0
Yoinky3000 Jul 12, 2024
5337ce2
Merge pull request #3332 from Yoinky3000/dev
vladmandic Jul 12, 2024
14e03d9
rewrite zluda installer
lshqqytiger Jul 12, 2024
0e04b2b
fix type hint
lshqqytiger Jul 12, 2024
a3495d1
Fix typo in `installer.py` logs
james-banks Jul 12, 2024
2a50e74
Merge pull request #3336 from james-banks/patch-3
vladmandic Jul 12, 2024
be3fbd7
zluda rocm6 support
lshqqytiger Jul 13, 2024
0b68077
zluda linux error message
lshqqytiger Jul 13, 2024
d4f6e3d
fix zluda installer
lshqqytiger Jul 16, 2024
14569ac
prevent segfault when no hip device found
lshqqytiger Jul 17, 2024
7eb7dd5
fix
lshqqytiger Jul 17, 2024
d3a7095
zluda better rocm detection
lshqqytiger Jul 20, 2024
0284c77
refactor rocm & zluda
lshqqytiger Jul 22, 2024
7fff2a7
fix linux
lshqqytiger Jul 22, 2024
4cf1340
just use rocm.is_installed
lshqqytiger Jul 22, 2024
25c3c61
fix
lshqqytiger Jul 22, 2024
9c1c8fe
NNCF fix AuraFlow
Disty0 Jul 22, 2024
f2769c0
ROCm flash atten fall back to sdpa with fp32 inputs
Disty0 Jul 22, 2024
4aabc8b
Add shift_factor to vae decode
Disty0 Jul 22, 2024
918a839
ROCm 6.1 switch to stable PyTorch
Disty0 Jul 25, 2024
4254256
Update the default ROCm ver to 6.1
Disty0 Jul 25, 2024
5a94741
rocm.py
lshqqytiger Jul 27, 2024
bdf6501
fix
lshqqytiger Jul 27, 2024
4492ded
fix hip path detection
lshqqytiger Jul 27, 2024
fa1e77c
Fix Full VAE previews
Disty0 Jul 27, 2024
3d0ba32
Fix Default scheduler not applying
Disty0 Jul 28, 2024
8dfc01d
update wiki
vladmandic Jul 28, 2024
f5f7ed2
experimental pytorch nightly xpu support
Disty0 Jul 30, 2024
6c75bcc
Optimum Quanto support
Disty0 Jul 30, 2024
b50a860
Fix T5 INT8 and add QINT8
Disty0 Jul 30, 2024
9965ef7
De-dupe Cascade
Disty0 Aug 1, 2024
bb707e4
FLUX support
Disty0 Aug 2, 2024
9e8ed74
ROCm add max version check
Disty0 Aug 3, 2024
8cc0354
Fix segfault with ROCm 6.2
Disty0 Aug 3, 2024
dc9e60a
Quant add shuffle models option
Disty0 Aug 4, 2024
7eacec4
Quant send to gpu with shuffle option on high vram systems
Disty0 Aug 4, 2024
82bcc2b
IPEX fix fp64 check
Disty0 Aug 5, 2024
fe88e84
IPEX fix diffusers import error
Disty0 Aug 5, 2024
33d80a3
IPEX fix FP64 error with FLUX
Disty0 Aug 5, 2024
c66aab2
Make ruff happy
Disty0 Aug 5, 2024
a17e452
Fix Cascade with long prompts
Disty0 Aug 6, 2024
600340d
Fix Cascade with custom samplers
Disty0 Aug 6, 2024
9818950
Cascade fixes
Disty0 Aug 7, 2024
1cf87ef
Change cascade load order
Disty0 Aug 7, 2024
a6b6d16
update requirements
vladmandic Aug 7, 2024
db0f6c7
Cascade fix get_timestep_ratio_conditioning
Disty0 Aug 7, 2024
0d57fa3
fix zluda torch cpp_extension
lshqqytiger Aug 8, 2024
61961f5
hipblaslt check torch version
lshqqytiger Aug 8, 2024
c490b9c
fix first launch
lshqqytiger Aug 8, 2024
e1c4038
zluda hijack torch jit
lshqqytiger Aug 8, 2024
6431296
Fix Cascade empty prompt encode
Disty0 Aug 8, 2024
58d49f2
Custom sampler support for Cascade Decoder
Disty0 Aug 9, 2024
2545285
skip hipblaslt check if no gpu detected
lshqqytiger Aug 9, 2024
e729b57
Make ruff happy
Disty0 Aug 9, 2024
864263f
accurate wsl check
lshqqytiger Aug 9, 2024
9c4213e
tcmalloc experiment
lshqqytiger Aug 9, 2024
a9f2799
enable diffusers_move_unet for Flux
lshqqytiger Aug 9, 2024
c2c4e17
enable FluxPipeline
lshqqytiger Aug 9, 2024
e3b087b
Add balanced offload mode and make offload modes a single choice list
Disty0 Aug 11, 2024
4ba1732
Check device index for balanced offload
Disty0 Aug 11, 2024
fb89e26
Auto detect memory size ffor balaced offload
Disty0 Aug 11, 2024
6a1af56
FLUX quant loading support
Disty0 Aug 11, 2024
da9c46c
Fix optimum-quanto not found
Disty0 Aug 11, 2024
70c2e84
Prompt cache support for Flux
Disty0 Aug 11, 2024
0f7b7e8
Fix memory detection when no gpu is present
Disty0 Aug 11, 2024
dcedcba
IPEX fixes
Disty0 Aug 12, 2024
6e97421
IPEX update to 2.1.40+xpu
Disty0 Aug 12, 2024
26d1d42
IPEX update interpolate hijack
Disty0 Aug 12, 2024
c6cd072
hip_visible_devices
lshqqytiger Aug 13, 2024
04f4757
Update to new huggingface stuff
osanseviero Aug 13, 2024
94fce42
Merge pull request #3368 from osanseviero/master
vladmandic Aug 13, 2024
7d805e9
there are no multiple models, so no need to check
AznamirWoW Aug 13, 2024
6757af5
Merge pull request #3370 from AznamirWoW/dev
vladmandic Aug 13, 2024
ef1dedf
Samplers prefer model defaults over diffusers defaults
Disty0 Aug 13, 2024
97a5fae
Fix qint4
Disty0 Aug 13, 2024
a73716b
Add meta to device check
Disty0 Aug 13, 2024
8619a7f
Better balanced offload
Disty0 Aug 14, 2024
7edc864
Add module name to disk offload path
Disty0 Aug 14, 2024
119c372
More offload checks
Disty0 Aug 14, 2024
8699e0c
Add gc to balanced offload
Disty0 Aug 14, 2024
237cab2
Add offload check to cascade's vqgan
Disty0 Aug 14, 2024
f3f721e
Quanto disable gemm kernels
Disty0 Aug 14, 2024
80be079
Flux load quant model to cpu
Disty0 Aug 15, 2024
7a6b45b
Balanced offload move device map calcs
Disty0 Aug 15, 2024
3f5c3ba
Add warning to Quanto with balanced and sequential offload
Disty0 Aug 15, 2024
04172e5
Quanto Lora support
Disty0 Aug 16, 2024
5a75b12
Fix Lora with Balanced Offlaod
Disty0 Aug 16, 2024
d1b87ef
Add Quanto Lora hijack additionally
Disty0 Aug 17, 2024
bce3c7e
Fix --Xvram flags not activating offload
Disty0 Aug 17, 2024
5c857e8
Add check for Flux attention processor
Disty0 Aug 17, 2024
a3f26c9
Convert Dynamic Attention SDP to a global SDP option
Disty0 Aug 17, 2024
0734c75
Add Heun FlowMatch
Disty0 Aug 17, 2024
a795770
Update changelog
Disty0 Aug 17, 2024
b862400
IPEX fix AMP custom_fwd
Disty0 Aug 18, 2024
5e1da44
IPEX fix custom_fwd x2
Disty0 Aug 18, 2024
2586a18
Don't add 0.1 to the GPU memory
Disty0 Aug 18, 2024
42fff22
Update CHANGELOG.md
Disty0 Aug 18, 2024
cc89ed8
Rename cpu offload to model offload
Disty0 Aug 19, 2024
b025d1d
Round memory size in settings
Disty0 Aug 19, 2024
6a58d52
Make eval use apply_compile_to_model
Disty0 Aug 19, 2024
16d6c03
Optimum Quanto activations support
Disty0 Aug 21, 2024
694d25c
Fix quanto
Disty0 Aug 21, 2024
c3ff21c
Quanto freeze the model before calibration
Disty0 Aug 21, 2024
2caf52a
Update Quanto settings names
Disty0 Aug 21, 2024
e40e13a
Quanto fix Flux activations
Disty0 Aug 21, 2024
b706083
Quanto Activations fix Diffuser's model offload bug
Disty0 Aug 21, 2024
963940b
Fix no half vae
Disty0 Aug 21, 2024
3c4d9f3
Change vae cast order
Disty0 Aug 21, 2024
3a97db1
Fix setting no-half-vae
Disty0 Aug 21, 2024
3165bbc
Flux quant loding detect dtype from state_dict
Disty0 Aug 21, 2024
35d70b3
Flux change no-half-vae check order
Disty0 Aug 22, 2024
02d6b67
Cascade decide atten mask value from the model name
Disty0 Aug 22, 2024
c1285c6
Cascade re-add empty embed provider
Disty0 Aug 23, 2024
ab9a4d3
update zluda
lshqqytiger Aug 24, 2024
517ee93
xhinker parser implementation
AI-Casanova Aug 25, 2024
1de7716
Remove print commands
AI-Casanova Aug 25, 2024
a5d1c65
update zluda (hip sdk 5, 6)
lshqqytiger Aug 26, 2024
5ed58ac
end-to-end update flux, see changelog and wiki
vladmandic Aug 28, 2024
ed52624
Merge branch 'dev' into xhinker
vladmandic Aug 28, 2024
527c05f
Merge pull request #3381 from AI-Casanova/xhinker
vladmandic Aug 28, 2024
707fc1d
flux prompt attention
vladmandic Aug 28, 2024
277d84c
fix control api
vladmandic Aug 28, 2024
b39abdd
control allow resizing overrides to match input
vladmandic Aug 28, 2024
35565fa
offload disabled controlnets
vladmandic Aug 28, 2024
65c137a
update
vladmandic Aug 28, 2024
bb7f84b
better api defaults
vladmandic Aug 28, 2024
0094838
Don't preload blaslt with ROCm 6.2
Disty0 Aug 28, 2024
db6235e
update changelog and requirements
vladmandic Aug 28, 2024
81d9af7
control tab include all scripts
vladmandic Aug 28, 2024
5503a3b
fix invalid resize mode
vladmandic Aug 28, 2024
768c7d0
update changelog
lshqqytiger Aug 28, 2024
73efa76
add taesd flux
vladmandic Aug 28, 2024
ce6224a
update changelog and readme
vladmandic Aug 28, 2024
01fc706
fix nvml gpu monitor
vladmandic Aug 29, 2024
8cf9590
flux add safetensors unet load
vladmandic Aug 29, 2024
4057a04
update wiki
vladmandic Aug 29, 2024
d0905a8
update notes
vladmandic Aug 29, 2024
4f606d3
update auraflow
vladmandic Aug 29, 2024
0b901e0
xhinker move te as needed
vladmandic Aug 29, 2024
db9de0d
prioritize given backend if use_* argument is presented
lshqqytiger Aug 29, 2024
68a8ffa
ipadapter optional face autocrop input image
vladmandic Aug 29, 2024
694acce
update changelog
vladmandic Aug 29, 2024
be0ea62
cleanup
vladmandic Aug 29, 2024
121fd43
Fix IPEX installer
Disty0 Aug 29, 2024
ec85ab4
cleanup
vladmandic Aug 29, 2024
2c8cb5c
vae exception handling
vladmandic Aug 29, 2024
81583ef
add flux offline config and fix vae config reference
vladmandic Aug 29, 2024
1114165
update changelog and cleanup
vladmandic Aug 30, 2024
ed45477
animatediff updates
vladmandic Aug 30, 2024
9da340f
bump up default resolution
vladmandic Aug 30, 2024
0a05826
fix gallery sort
vladmandic Aug 30, 2024
e3b54cf
fix control show extensions
vladmandic Aug 30, 2024
ec281d6
use peft for lora on non-sd models
vladmandic Aug 30, 2024
264460b
improve xyz grid for loras and add strength
vladmandic Aug 30, 2024
1b6ebd9
update diffusers
vladmandic Aug 31, 2024
a11a5f0
flux qint auto-download quantization map
vladmandic Aug 31, 2024
9d14258
flux nf4 offline load
vladmandic Aug 31, 2024
d2695f2
update changelog
vladmandic Aug 31, 2024
127 changes: 96 additions & 31 deletions CHANGELOG.md
@@ -1,39 +1,104 @@
# Change Log for SD.Next

## Update for 2024-07-09: WiP
## Update for 2024-08-31

### Pending
### Highlights for 2024-08-31

- Requires `diffusers==0.30.0`
- [AuraFlow/LavenderFlow](https://github.com/huggingface/diffusers/pull/8796) (previously known as LavenderFlow)
- [Kolors](https://github.com/huggingface/diffusers/pull/8812)
- [ControlNet Union](https://huggingface.co/xinsir/controlnet-union-sdxl-1.0) pipeline
- FlowMatchHeunDiscreteScheduler enable
Summer break is over and we are back with a massive update!

### Highlights
Support for all of the new models:
- [Black Forest Labs FLUX.1](https://blackforestlabs.ai/announcing-black-forest-labs/)
- [AuraFlow 0.3](https://huggingface.co/fal/AuraFlow)
- [AlphaVLLM Lumina-Next-SFT](https://huggingface.co/Alpha-VLLM/Lumina-Next-SFT-diffusers)
- [Kwai Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
- [HunyuanDiT 1.2](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers)

What else? Just a bit... ;)

New **fast-install** mode, new **Optimum Quanto** and **BitsAndBytes** based quantization modes, new **balanced offload** mode that dynamically offloads GPU<->CPU as needed, and more...
And from previous service-pack: new **ControlNet-Union** *all-in-one* model, support for **DoRA** networks, additional **VLM** models, new **AuraSR** upscaler

**Breaking Changes...**

Due to internal changes, you'll need to reset your **attention** and **offload** settings!
But... for a good reason: the new *balanced offload* is magic when it comes to memory utilization while sacrificing minimal performance!

Massive update to WiKi with over 20 new pages and articles, now includes guides for nearly all major features
Support for new models:
- [AlphaVLLM Lumina-Next-SFT](https://huggingface.co/Alpha-VLLM/Lumina-Next-SFT-diffusers)
- [Kwai Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
- [HunyuanDiT 1.2](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers)
### Details for 2024-08-31

What else? Just a bit... ;)
New **fast-install** mode, new **controlnet-union** *all-in-one* model, support for **DoRA** networks, additional **VLM** models, new **AuraSR** upscaler, and more...
**New Models...**

### New Models
To use any of the new models, simply select the model from *Networks -> Reference* and it will be auto-downloaded on first use

- [Black Forest Labs FLUX.1](https://blackforestlabs.ai/announcing-black-forest-labs/)
FLUX.1 models are based on a hybrid architecture of multimodal and parallel diffusion transformer blocks, scaled to 12B parameters and building on flow matching
This is a very large model at ~32GB in size; it's recommended to use a) offloading and b) quantization
For more information on variations, requirements, options, and how to download and use FLUX.1, see [Wiki](https://github.com/vladmandic/automatic/wiki/FLUX)
SD.Next supports:
- [FLUX.1 Dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) and [FLUX.1 Schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) original variations
- additional [qint8](https://huggingface.co/Disty0/FLUX.1-dev-qint8) and [qint4](https://huggingface.co/Disty0/FLUX.1-dev-qint4) quantized variations
- additional [nf4](https://huggingface.co/sayakpaul/flux.1-dev-nf4) quantized variation
- [AuraFlow](https://huggingface.co/fal/AuraFlow)
AuraFlow v0.3 is the largest fully open-sourced flow-based text-to-image generation model
This is a very large model at 6.8B params and nearly 31GB in size, smaller variants are expected in the future
Use scheduler: Default or Euler FlowMatch or Heun FlowMatch
- [AlphaVLLM Lumina-Next-SFT](https://huggingface.co/Alpha-VLLM/Lumina-Next-SFT-diffusers)
to use, simply select from *networks -> reference
use scheduler: default or euler flowmatch or heun flowmatch
note: this model uses T5 XXL variation of text encoder
(previous version of Lumina used Gemma 2B as text encoder)
- [Kwai Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
to use, simply select from *networks -> reference
note: this is an SDXL style model that replaces standard CLiP-L and CLiP-G text encoders with a massive `chatglm3-6b` encoder
however, this new encoder does support both English and Chinese prompting
- [HunyuanDiT 1.2](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers)
to use, simply select from *networks -> reference
Lumina-Next-SFT is a Next-DiT model containing 2B parameters, enhanced through high-quality supervised fine-tuning (SFT)
This model uses T5 XXL variation of text encoder (previous version of Lumina used Gemma 2B as text encoder)
Use scheduler: Default or Euler FlowMatch or Heun FlowMatch
- [Kwai Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
Kolors is a large-scale text-to-image generation model based on latent diffusion
This is an SDXL style model that replaces standard CLIP-L and CLIP-G text encoders with a massive `chatglm3-6b` encoder supporting both English and Chinese prompting
- [HunyuanDiT 1.2](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT-v1.2-Diffusers)
Hunyuan-DiT is a powerful multi-resolution diffusion transformer (DiT) with fine-grained Chinese understanding
- [AnimateDiff](https://github.com/guoyww/animatediff/)
support for additional models: **SD 1.5 v3** (Sparse), **SD Lightning** (4-step), **SDXL Beta**

**New Features...**

- support for **Balanced Offload**, thanks @Disty0!
balanced offload will dynamically split and offload models from the GPU based on the max configured GPU and CPU memory size
model parts that don't fit in the GPU will be dynamically sliced and offloaded to the CPU
see *Settings -> Diffusers Settings -> Max GPU memory and Max CPU memory*
*note*: recommended value for max GPU memory is ~80% of your total GPU memory
*note*: balanced offload will force loading LoRA with Diffusers method
*note*: balanced offload is not compatible with Optimum Quanto
- support for **Optimum Quanto** with 8 bit and 4 bit quantization options, thanks @Disty0 and @Trojaner!
to use, go to Settings -> Compute Settings and enable "Quantize Model weights with Optimum Quanto" option
*note*: Optimum Quanto requires PyTorch 2.4
- new prompt attention mode: **xhinker** which brings support for prompt attention to new models such as FLUX.1 and SD3
to use, enable in *Settings -> Execution -> Prompt attention*
- use [PEFT](https://huggingface.co/docs/peft/main/en/index) for **LoRA** handling on all models other than SD15/SD21/SDXL
this improves LoRA compatibility for SC, SD3, AuraFlow, Flux, etc.
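The *balanced offload* feature above can be illustrated with a short sketch: given per-module sizes and a configured GPU memory budget, modules stay on the GPU until the budget is exhausted and the remainder is assigned to the CPU. This is a hypothetical illustration of the idea only (module names and sizes are made up), not SD.Next's actual implementation:

```python
# Hypothetical sketch of balanced offload: keep modules on the GPU until the
# configured memory budget is exhausted, assign the rest to the CPU.
# Module names and sizes below are made up for illustration.
def balance_offload(module_sizes: dict, max_gpu_gb: float) -> dict:
    placement, used = {}, 0.0
    for name, size_gb in module_sizes.items():
        if used + size_gb <= max_gpu_gb:
            placement[name] = 'gpu'
            used += size_gb
        else:
            placement[name] = 'cpu'  # offloaded; moved back to the GPU on demand
    return placement

modules = {'transformer': 22.0, 'text_encoder_2': 9.0, 'text_encoder': 0.2, 'vae': 0.2}
print(balance_offload(modules, max_gpu_gb=24.0))
# → {'transformer': 'gpu', 'text_encoder_2': 'cpu', 'text_encoder': 'gpu', 'vae': 'gpu'}
```

This also illustrates why the recommended max GPU memory is ~80% of total: the remaining headroom is needed for activations and intermediate tensors.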
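Similarly, the core idea behind the 8-bit (`qint8`-style) weight quantization that Optimum Quanto provides can be sketched as symmetric scale quantization; this is a conceptual sketch of the general technique, not the Optimum Quanto implementation:

```python
# Conceptual sketch of symmetric 8-bit weight quantization (qint8-style);
# not the Optimum Quanto implementation.
def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid div-by-zero for all-zero weights
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    return [v * scale for v in q]

w = [0.5, -1.27, 0.001, 1.0]
q, scale = quantize_int8(w)
print(q)  # → [50, -127, 0, 100]
restored = dequantize_int8(q, scale)
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(w, restored))
```

4-bit modes follow the same scheme with a smaller integer range, trading more quality for a further ~2x memory saving.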

**Changes & Fixes...**

- default resolution bumped from 512x512 to 1024x1024, time to move on ;)
- convert **Dynamic Attention SDP** into a global SDP option, thanks @Disty0!
*note*: requires reset of selected attention option
- update default **CUDA** version from 12.1 to 12.4
- update `requirements`
- samplers now prefer the model defaults over the diffusers defaults, thanks @Disty0!
- improve xyz grid for lora handling and add lora strength option
- don't enable Dynamic Attention by default on platforms that support Flash Attention, thanks @Disty0!
- convert offload options into a single choice list, thanks @Disty0!
*note*: requires reset of selected offload option
- control module allows resizing of individual process override images to match input image
for example: set size->before->method:nearest, mode:fixed or mode:fill
- control tab includes superset of txt and img scripts
- automatically offload disabled controlnet units
- prioritize specified backend if `--use-*` option is used, thanks @lshqqytiger
- ipadapter option to auto-crop input images to faces to improve efficiency of face-transfer ipadapters
- update **IPEX** to 2.1.40+xpu on Linux, thanks @Disty0!
- general **ROCm** fixes, thanks @lshqqytiger!
- support for HIP SDK 6.1 on ZLUDA backend, thanks @lshqqytiger!
- fix full vae previews, thanks @Disty0!
- fix default scheduler not being applied, thanks @Disty0!
- fix Stable Cascade with custom schedulers, thanks @Disty0!
- fix LoRA apply with force-diffusers
- fix LoRA scales with force-diffusers
- fix control API
- fix VAE load referencing incorrect configuration
- fix NVML gpu monitoring
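The LoRA-related changes above (PEFT-based loading and the per-LoRA strength option in the xyz grid) come down to the same math: a low-rank update added to the base weights, scaled by a strength factor. A minimal sketch using plain nested lists, not the PEFT implementation:

```python
# Minimal sketch of LoRA application: W' = W + strength * (B @ A),
# shown on plain nested lists; PEFT does this on tensors.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(W, A, B, strength=1.0):
    delta = matmul(B, A)  # low-rank update; rank = number of rows in A
    return [[w + strength * d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]  # base weight (2x2)
A = [[1.0, 2.0]]              # down projection (1x2)
B = [[0.5], [0.25]]           # up projection (2x1)
print(apply_lora(W, A, B, strength=0.5))
# → [[1.25, 0.5], [0.125, 1.25]]
```

Sweeping `strength` is exactly what the xyz grid's new LoRA strength axis varies.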

## Update for 2024-07-08

@@ -57,13 +122,13 @@ This release is primary service release with cumulative fixes and several improvements
**And fixes...**
- enable **Florence VLM** for all platforms, thanks @lshqqytiger!
- improve ROCm detection under WSL2, thanks @lshqqytiger!
- add SD3 with FP16 T5 to list of detected models
- fix executing extensions with zero params
- add support for embeddings bundled in LoRA, thanks @AI-Casanova!
- fix nncf for lora, thanks @Disty0!
- fix diffusers version detection for SD3
- fix current step for higher order samplers
- fix control input type video
- fix reset pipeline at the end of each iteration
- fix faceswap when no faces detected
7 changes: 6 additions & 1 deletion README.md
@@ -31,7 +31,7 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
- Multiple UIs!
▹ **Standard | Modern**
- Multiple diffusion models!
▹ **Stable Diffusion 1.5/2.1/XL/3.0 | LCM | Lightning | Segmind | Kandinsky | Pixart-α | Pixart-Σ | Stable Cascade | Würstchen | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | KOALA | SDXS | Hyper-SD | HunyuanDiT | etc.**
▹ **Stable Diffusion 1.5/2.1/XL/3.0 | LCM | Lightning | Segmind | Kandinsky | Pixart-α | Pixart-Σ | Stable Cascade | FLUX.1 | AuraFlow | Würstchen | Lumina | Kolors | aMUSEd | DeepFloyd IF | UniDiffusion | SD-Distilled | BLiP Diffusion | KOALA | SDXS | Hyper-SD | HunyuanDiT | etc.**
- Built-in Control for Text, Image, Batch and video processing!
▹ **ControlNet | ControlNet XS | Control LLLite | T2I Adapters | IP Adapters**
- Multiplatform!
@@ -53,6 +53,7 @@ All individual features are not listed here, instead check [ChangeLog](CHANGELOG
![Screenshot-Dark](html/screenshot-text2image.jpg)

*Main interface using **ModernUI***:
![Screenshot-Dark](html/screenshot-modernui-f1.jpg)
![Screenshot-Dark](html/screenshot-modernui.jpg)
![Screenshot-Dark](html/screenshot-modernui-sd3.jpg)

@@ -69,6 +70,10 @@ Additional models will be added as they become available and there is public interest
- [StabilityAI Stable Diffusion 3 Medium](https://stability.ai/news/stable-diffusion-3-medium)
- [StabilityAI Stable Video Diffusion](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid) Base, XT 1.0, XT 1.1
- [LCM: Latent Consistency Models](https://github.com/openai/consistency_models)
- [Black Forest Labs FLUX.1](https://blackforestlabs.ai/announcing-black-forest-labs/) Dev, Schnell
- [AuraFlow](https://huggingface.co/fal/AuraFlow)
- [AlphaVLLM Lumina-Next-SFT](https://huggingface.co/Alpha-VLLM/Lumina-Next-SFT-diffusers)
- [Kwai Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
- [Playground](https://huggingface.co/playgroundai/playground-v2-256px-base) *v1, v2 256, v2 512, v2 1024 and latest v2.5*
- [Stable Cascade](https://github.com/Stability-AI/StableCascade) *Full* and *Lite*
- [aMUSEd 256](https://huggingface.co/amused/amused-256) 256 and 512
14 changes: 4 additions & 10 deletions TODO.md
@@ -4,20 +4,14 @@ Main ToDo list can be found at [GitHub projects](https://github.com/users/vladma

## Future Candidates

- animatediff-sdxl <https://github.com/huggingface/diffusers/pull/6721>
- cogvideo-x: <https://huggingface.co/THUDM/CogVideoX-5b>
- animatediff prompt-travel: <https://github.com/huggingface/diffusers/pull/9231>
- async lowvram: <https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14855>
- fp8: <https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/14031>
- ipadapter-negative: https://github.com/huggingface/diffusers/discussions/7167
- hd-painter: https://github.com/huggingface/diffusers/blob/main/examples/community/README.md#hd-painter
- init latents: variations, img2img
- diffusers public callbacks
- include reference styles
- lora: sc lora, etc

## Experimental

- [SDXL Flash Mini](https://huggingface.co/sd-community/sdxl-flash-mini)
SDXL type that weighs less, consumes less video memory, and the quality has not dropped much
to use, simply select from *networks -> models -> reference -> SDXL Flash Mini*
recommended parameters: steps: 6-9, cfg scale: 2.5-3.5, sampler: DPM++ SDE

### Missing

4 changes: 2 additions & 2 deletions cli/api-control.py
@@ -132,7 +132,7 @@ def get_image(encoded, output):


if __name__ == "__main__":
parser = argparse.ArgumentParser(description = 'api-img2img')
parser = argparse.ArgumentParser(description = 'api-control')
parser.add_argument('--init', required=False, default=None, help='init image')
parser.add_argument('--input', required=False, default=None, help='input image')
parser.add_argument('--mask', required=False, help='mask image')
@@ -148,5 +148,5 @@ def get_image(encoded, output):
parser.add_argument('--control', required=False, help='control units')
parser.add_argument('--ipadapter', required=False, help='ipadapter units')
args = parser.parse_args()
log.info(f'img2img: {args}')
log.info(f'api-control: {args}')
generate(args)
2 changes: 1 addition & 1 deletion cli/api-faceid.py
@@ -95,7 +95,7 @@ def generate(args): # pylint: disable=redefined-outer-name
parser.add_argument('--output', required=False, default=None, help='output image file')
parser.add_argument('--model', required=False, help='model name')
args = parser.parse_args()
log.info(f'img2img: {args}')
log.info(f'api-faceid: {args}')
generate(args)

"""
59 changes: 59 additions & 0 deletions cli/api-faces.py
@@ -0,0 +1,59 @@
#!/usr/bin/env python
import os
import io
import base64
import logging
import argparse
import requests
import urllib3
from PIL import Image

sd_url = os.environ.get('SDAPI_URL', "http://127.0.0.1:7860")
sd_username = os.environ.get('SDAPI_USR', None)
sd_password = os.environ.get('SDAPI_PWD', None)

logging.basicConfig(level = logging.INFO, format = '%(asctime)s %(levelname)s: %(message)s')
log = logging.getLogger(__name__)
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)


def auth():
    if sd_username is not None and sd_password is not None:
        return requests.auth.HTTPBasicAuth(sd_username, sd_password)
    return None


def post(endpoint: str, dct: dict = None):
    req = requests.post(f'{sd_url}{endpoint}', json = dct, timeout=300, verify=False, auth=auth())
    if req.status_code != 200:
        return { 'error': req.status_code, 'reason': req.reason, 'url': req.url }
    else:
        return req.json()


def encode(f):
    image = Image.open(f)
    if image.mode == 'RGBA':
        image = image.convert('RGB')
    with io.BytesIO() as stream:
        image.save(stream, 'JPEG')
        image.close()
        values = stream.getvalue()
        encoded = base64.b64encode(values).decode()
        return encoded


def detect(args): # pylint: disable=redefined-outer-name
    data = post('/sdapi/v1/faces', { 'image': encode(args.image) })
    for face in zip(data['images'], data['scores']):
        log.info(f'Face: score={face[1]}')
        image = Image.open(io.BytesIO(base64.b64decode(face[0])))
        image.save(f'/tmp/face_{face[1]}.jpg')


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description = 'api-faces')
    parser.add_argument('--image', required=True, help='input image')
    args = parser.parse_args()
    log.info(f'api-faces: {args}')
    detect(args)
2 changes: 1 addition & 1 deletion cli/api-img2img.py
@@ -94,5 +94,5 @@ def generate(args): # pylint: disable=redefined-outer-name
parser.add_argument('--output', required=False, default=None, help='output image file')
parser.add_argument('--model', required=False, help='model name')
args = parser.parse_args()
log.info(f'img2img: {args}')
log.info(f'api-img2img: {args}')
generate(args)
2 changes: 1 addition & 1 deletion cli/api-info.py
@@ -53,5 +53,5 @@ def info(args): # pylint: disable=redefined-outer-name
parser = argparse.ArgumentParser(description = 'api-info')
parser.add_argument('--input', required=True, help='input image')
args = parser.parse_args()
log.info(f'info: {args}')
log.info(f'api-info: {args}')
info(args)
2 changes: 1 addition & 1 deletion cli/api-json.py
@@ -38,7 +38,7 @@ def post(endpoint: str, payload: dict = None):


if __name__ == "__main__":
parser = argparse.ArgumentParser(description = 'api-txt2img')
parser = argparse.ArgumentParser(description = 'api-json')
parser.add_argument('endpoint', nargs=1, help='endpoint')
parser.add_argument('json', nargs=1, help='json data or file')
args = parser.parse_args()
2 changes: 1 addition & 1 deletion cli/api-mask.py
@@ -79,5 +79,5 @@ def info(args): # pylint: disable=redefined-outer-name
parser.add_argument('--type', required=False, help='output mask type')
parser.add_argument('--output', required=False, help='output image')
args = parser.parse_args()
log.info(f'info: {args}')
log.info(f'api-mask: {args}')
info(args)
2 changes: 1 addition & 1 deletion cli/api-preprocess.py
@@ -72,5 +72,5 @@ def info(args): # pylint: disable=redefined-outer-name
parser.add_argument('--model', required=True, help='preprocessing model')
parser.add_argument('--output', required=False, help='output image')
args = parser.parse_args()
log.info(f'info: {args}')
log.info(f'api-preprocess: {args}')
info(args)
2 changes: 1 addition & 1 deletion cli/api-txt2img.py
@@ -80,5 +80,5 @@ def generate(args): # pylint: disable=redefined-outer-name
parser.add_argument('--output', required=False, default=None, help='output image file')
parser.add_argument('--model', required=False, help='model name')
args = parser.parse_args()
log.info(f'txt2img: {args}')
log.info(f'api-txt2img: {args}')
generate(args)
2 changes: 1 addition & 1 deletion cli/api-upscale.py
@@ -86,5 +86,5 @@ def upscale(args): # pylint: disable=redefined-outer-name
parser.add_argument('--upscaler', required=False, default='Nearest', help='upscaler name')
parser.add_argument('--scale', required=False, default=2, help='upscaler scale')
args = parser.parse_args()
log.info(f'upscale: {args}')
log.info(f'api-upscale: {args}')
upscale(args)
2 changes: 1 addition & 1 deletion cli/api-vqa.py
@@ -60,5 +60,5 @@ def info(args): # pylint: disable=redefined-outer-name
parser.add_argument('--model', required=False, help='vqa model')
parser.add_argument('--question', required=False, help='question')
args = parser.parse_args()
log.info(f'info: {args}')
log.info(f'api-vqa: {args}')
info(args)
2 changes: 1 addition & 1 deletion cli/hf-search.py
@@ -14,5 +14,5 @@
library=['diffusers'],
)
res = hf_api.list_models(filter=model_filter, full=True, limit=50, sort="downloads", direction=-1)
models = [{ 'name': m.modelId, 'downloads': m.downloads, 'mtime': m.lastModified, 'url': f'https://huggingface.co/{m.modelId}', 'pipeline': m.pipeline_tag, 'tags': m.tags } for m in res]
models = [{ 'name': m.id, 'downloads': m.downloads, 'mtime': m.lastModified, 'url': f'https://huggingface.co/{m.id}', 'pipeline': m.pipeline_tag, 'tags': m.tags } for m in res]
print(models)
32 changes: 32 additions & 0 deletions configs/flux/model_index.json
@@ -0,0 +1,32 @@
{
"_class_name": "FluxPipeline",
"_diffusers_version": "0.30.0.dev0",
"scheduler": [
"diffusers",
"FlowMatchEulerDiscreteScheduler"
],
"text_encoder": [
"transformers",
"CLIPTextModel"
],
"text_encoder_2": [
"transformers",
"T5EncoderModel"
],
"tokenizer": [
"transformers",
"CLIPTokenizer"
],
"tokenizer_2": [
"transformers",
"T5TokenizerFast"
],
"transformer": [
"diffusers",
"FluxTransformer2DModel"
],
"vae": [
"diffusers",
"AutoencoderKL"
]
}
11 changes: 11 additions & 0 deletions configs/flux/scheduler/scheduler_config.json
@@ -0,0 +1,11 @@
{
"_class_name": "FlowMatchEulerDiscreteScheduler",
"_diffusers_version": "0.30.0.dev0",
"base_image_seq_len": 256,
"base_shift": 0.5,
"max_image_seq_len": 4096,
"max_shift": 1.15,
"num_train_timesteps": 1000,
"shift": 1.0,
"use_dynamic_shifting": false
}
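For reference, `shift` and `use_dynamic_shifting` in this scheduler config control how the flow-matching sigmas are rescaled; with static shifting, FlowMatch-style schedulers apply sigma' = shift * sigma / (1 + (shift - 1) * sigma), so the `"shift": 1.0` here leaves the schedule unchanged. A quick sketch:

```python
# Static sigma shifting as used by FlowMatch-style schedulers:
# sigma' = shift * sigma / (1 + (shift - 1) * sigma)
def shift_sigma(sigma: float, shift: float) -> float:
    return shift * sigma / (1 + (shift - 1) * sigma)

assert shift_sigma(0.5, 1.0) == 0.5  # shift = 1.0 (this config): schedule unchanged
print(shift_sigma(0.5, 3.0))         # → 0.75, larger shift biases toward noisier timesteps
```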