S-CFG, Optimizations, and More #37

Merged 83 commits on May 18, 2024

Commits
a9656f5
add module_hooks.py
v0xie May 2, 2024
df35a5a
add function to add forward hook
v0xie May 2, 2024
c0007e8
remove self arg from modules_add_field
v0xie May 2, 2024
1766d21
add save_attn_maps
v0xie May 2, 2024
1f0c1f3
update save_attn_maps script
v0xie May 3, 2024
063d848
update saving logic
v0xie May 3, 2024
c73d5cb
remove warning if removing non-existent field
v0xie May 7, 2024
ae34cb4
saving attention maps
v0xie May 7, 2024
b48a995
multiply by text embeddings for cross-attention
v0xie May 8, 2024
47fc67a
add prompt_utils.py
v0xie May 8, 2024
93cfe63
update save format
v0xie May 8, 2024
a8cbe9a
change default module_name_filter value
v0xie May 9, 2024
7fd4fc4
working on reshaping attn maps to fit algo
v0xie May 9, 2024
3d248e4
fix scfg by calculating attention_probs intermediate step
v0xie May 9, 2024
ac30e4b
working implementation
v0xie May 9, 2024
fb8df16
reimplementing in scfg.py
v0xie May 9, 2024
54f2151
refactor in progress
v0xie May 9, 2024
8fd8724
fixes
v0xie May 9, 2024
db8e8c5
fix parameters
v0xie May 9, 2024
5ce2991
rescale masks if wrong shape
v0xie May 9, 2024
f5c2980
add license and credit
v0xie May 9, 2024
1009f5a
slider 0.1 increments
v0xie May 9, 2024
95b79f3
restore original pag.py
v0xie May 9, 2024
f2017cd
reimplement pag
v0xie May 9, 2024
7e175c4
test fix for non square resolution
v0xie May 10, 2024
49b4026
refactoring R
v0xie May 10, 2024
add83f0
refactor saved attn maps
v0xie May 12, 2024
2fe3b97
remove unused fields, unhook in postprocess
v0xie May 12, 2024
89b21c9
set r to number of attn map resolutions to aggregate
v0xie May 12, 2024
12bc8d2
fix self-attn divisor, rate*scfg_scale before smooth
v0xie May 12, 2024
604ca4e
optimization: only add to_k_map field to cross-attn modules
v0xie May 12, 2024
f0474a7
fix saveattnmaps postprocess running without token count
v0xie May 13, 2024
603c9cd
load saveattnmaps script when INCANT_DEBUG env set
v0xie May 13, 2024
6aa847f
Merge pull request #34 from v0xie/refactor/hooks
v0xie May 13, 2024
3323f8c
Merge branch 'dev' into scfg-2
v0xie May 13, 2024
14ff662
revert pag change
v0xie May 13, 2024
274578f
R is a hyperparam not related to attn maps
v0xie May 14, 2024
4f56dfa
add modules to run last, add cfg-combiner
v0xie May 14, 2024
e8324cd
adding script callback hooks to cfg combiner
v0xie May 14, 2024
ea22ef0
cfg combiner handling pag guidance
v0xie May 14, 2024
46c6340
cfg combiner handles scheduled cfg
v0xie May 14, 2024
b989431
clarify how pag is calculated
v0xie May 14, 2024
e418c5b
Delete scfg.py
v0xie May 14, 2024
875e6de
Merge pull request #35 from v0xie/scfg-2
v0xie May 14, 2024
5eaaa03
Merge branch 'dev' into cfg-combine
v0xie May 14, 2024
a60fe29
refactor inner loop of combined_denoised into scfg_combine_denoised
v0xie May 14, 2024
15c624c
experimental formulation for combining s-cfg/scheduled cfg/pag
v0xie May 14, 2024
ea66b6a
fix s-cfg rate
v0xie May 14, 2024
b602f91
memory optimization, scfg_scale 0 is CFG
v0xie May 14, 2024
37a6032
correct scfg active string
v0xie May 14, 2024
43f403b
fix nan when dividing by 1e+8
v0xie May 14, 2024
3ffbc3e
add torch gc justincase
v0xie May 14, 2024
c8c0a75
fix pag condition
v0xie May 14, 2024
e163e5d
Merge pull request #36 from v0xie/cfg-combine
v0xie May 14, 2024
e5a67c7
save 0.6gb by deleting attn maps after used
v0xie May 15, 2024
8caa74a
remove profile import/calls
v0xie May 15, 2024
2fe515a
fix rate parameter names
v0xie May 15, 2024
62e1852
refactor repeated functions, alternate commented out flow
v0xie May 16, 2024
22e401f
fix map variable initializations
v0xie May 16, 2024
4de9d00
don't clone tensors produces same output
v0xie May 16, 2024
de7dd83
refactor to cleanup comments / unused code
v0xie May 16, 2024
2b5658d
disable profiler
v0xie May 16, 2024
5996596
replace hook fns with module_hooks fns
v0xie May 16, 2024
646fd58
reorder ui, only apply clamp rate if > 0
v0xie May 16, 2024
fd17202
semi-decouple cfg scheduler from pag
v0xie May 16, 2024
b56533a
Merge pull request #39 from v0xie/profile
v0xie May 16, 2024
034d796
Update README.md
MYusufY May 16, 2024
4f27820
update README.md
MYusufY May 16, 2024
b62f3f4
update README.md
MYusufY May 16, 2024
b0ecce1
Merge pull request #41 from MYusufY/master
v0xie May 16, 2024
d132f5a
fix: avoid nan in scfg in attn score calculations
v0xie May 18, 2024
9fcee5a
add debug statistics
v0xie May 18, 2024
11f696a
initialize smoothing in scfg_params
v0xie May 18, 2024
8a82d80
fix adding field to wrong modules
v0xie May 18, 2024
98a0675
Merge pull request #44 from v0xie/fix/scfg-stability
v0xie May 18, 2024
80724b0
Merge branch 'master' into dev
v0xie May 18, 2024
96f2341
update README.md install section
v0xie May 18, 2024
8995f49
update README.md
v0xie May 18, 2024
b2bd3e0
update README.md
v0xie May 18, 2024
1182ff2
update README.md
v0xie May 18, 2024
8a81cce
update README.md
v0xie May 18, 2024
b45afdb
update README.md
v0xie May 18, 2024
86ffc75
add notice for s-cfg vram reqs
v0xie May 18, 2024
101 changes: 93 additions & 8 deletions README.md
@@ -1,13 +1,78 @@
# sd-webui-incantations
This extension implements multiple novel algorithms that enhance image quality, prompt following, and more.

## COMPATIBILITY NOTICES:
#### Currently incompatible with stable-diffusion-webui-forge
On Forge, use this extension instead: https://github.com/pamparamm/sd-perturbed-attention

# Table of Contents
- [What is this?](#what-is-this)
- [Installation](#installation)
- [Compatibility Notice](#compatibility-notice)
- [News](#news)
- [Extension Features](#extension-features)
- [Semantic CFG](#semantic-cfg-s-cfg)
- [Perturbed Attention Guidance](#perturbed-attention-guidance)
- [CFG Scheduler](#cfg-interval--cfg-scheduler)
- [Multi-Concept T2I-Zero](#multi-concept-t2i-zero--attention-regulation)
- [Seek for Incantations](#seek-for-incantations)
- [Tutorial](#tutorial)
- [Other cool extensions](#also-check-out)
- [Credits](#credits)

## What is this?
### This extension for [AUTOMATIC1111/stable-diffusion-webui](https://github.com/AUTOMATIC1111/stable-diffusion-webui) implements algorithms from state-of-the-art research to achieve **higher-quality** images with *more accurate* prompt adherence.

All methods are **training-free** and rely only on modifying the text embeddings or attention maps.


## Installation
To install the `sd-webui-incantations` extension, follow these steps:

0. **Ensure you have Automatic1111 stable-diffusion-webui version 1.9.3 or later installed**

1. **Open the "Extensions" tab and navigate to the "Install from URL" section**:

2. **Paste the repository URL into the "URL for extension's git repository" field**:
```
https://github.com/v0xie/sd-webui-incantations.git
```

3. **Press the Install button**: Wait a few seconds for the extension to finish installing.

4. **Restart the Web UI**:
Completely restart your Stable Diffusion Web UI to load the new extension.

## Compatibility Notice
* Incompatible with **stable-diffusion-webui-forge**; on Forge, use https://github.com/pamparamm/sd-perturbed-attention instead
* Reported incompatible with Adetailer: https://github.com/v0xie/sd-webui-incantations/issues/21
* Incompatible with some older webui versions: https://github.com/v0xie/sd-webui-incantations/issues/14
* May conflict with other extensions which modify the CFGDenoiser

## News
- 15-05-2024 🔥 - S-CFG, optimizations for PAG and T2I-Zero, and more! https://github.com/v0xie/sd-webui-incantations/pull/37
- 29-04-2024 🔥 - The implementation of T2I-Zero is fixed and works much more stably now.

# Extension Features

---
## Semantic CFG (S-CFG)
https://arxiv.org/abs/2404.05384
Dynamically rescales the CFG guidance in each semantic region toward a uniform level to improve image/text alignment.
**Very computationally expensive**: a batch size of 4 at 1024x1024 can max out a 24GB RTX 4090.

#### Controls
* **SCFG Scale**: Multiplies the correction by a constant factor. Default: 1.0.
* **SCFG R**: A hyperparameter controlling the degree of cross-attention map refinement. Higher values use more memory and computation time. Default: 4.
* **Rate Min**: The minimum rate that the CFG can be scaled by. Default: 0.8.
* **Rate Max**: The maximum rate that the CFG can be scaled by. Default: 3.0.
* **Clamp Rate**: When greater than 0, overrides Rate Max by clamping the maximum rate to Clamp Rate / CFG (see the sketch below). Default: 0.0.
* **Start Step**: Start S-CFG on this step.
* **End Step**: End S-CFG after this step.
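
A minimal sketch of how these rate controls interact, with illustrative names (the actual logic lives in `scripts/scfg.py`): the per-region rate is clamped to [Rate Min, Rate Max], and a non-zero Clamp Rate replaces the upper bound with Clamp Rate / CFG.

```python
import torch

# Hypothetical sketch of the S-CFG rate clamping described above.
# The name and signature are illustrative, not the extension's actual API.
def clamp_scfg_rate(rate: torch.Tensor, cfg_scale: float,
                    rate_min: float = 0.8, rate_max: float = 3.0,
                    clamp_rate: float = 0.0) -> torch.Tensor:
    if clamp_rate > 0:
        # Clamp Rate overrides Rate Max: the upper bound becomes Clamp Rate / CFG
        rate_max = clamp_rate / cfg_scale
    return rate.clamp(min=rate_min, max=rate_max)
```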

#### Results
Prompt: "A cute puppy on the moon", Min Rate: 0.5, Max Rate: 10.0
- SD 1.5
![image](./images/xyz_grid-0006-1-SCFG.jpg)

#### Also check out the paper authors' official project repository:
- https://github.com/SmilesDZgk/S-CFG
#### [Return to top](#sd-webui-incantations)

---
## Perturbed Attention Guidance
@@ -30,7 +95,10 @@ Prompt: "a puppy and a kitten on the moon"
#### Also check out the paper authors' official project page:
- https://ku-cvlab.github.io/Perturbed-Attention-Guidance/
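
As implemented by this PR's combiner (see `scripts/cfg_combiner.py` below), PAG guidance is added on top of the CFG result as an extra scaled delta between the normal and the attention-perturbed prediction. A minimal sketch with illustrative names (the real code also folds in the per-cond weight):

```python
import torch

# Minimal sketch of the PAG update applied by the combiner below:
# add the (normal - perturbed) prediction delta, scaled like CFG.
def add_pag(denoised: torch.Tensor, cond_out: torch.Tensor,
            pag_out: torch.Tensor, pag_scale: float) -> torch.Tensor:
    return denoised + (cond_out - pag_out) * pag_scale
```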

#### [Return to top](#sd-webui-incantations)

---

## CFG Interval / CFG Scheduler
https://arxiv.org/abs/2404.07724 and https://arxiv.org/abs/2404.13040
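
Both papers restrict or schedule the guidance weight over the sampling trajectory; outside the chosen interval the model runs with no guidance. A minimal sketch of the interval gating, with hypothetical names (the extension's scheduler supports more schedule shapes):

```python
# Hypothetical sketch of CFG-interval gating: apply the configured CFG scale
# only while the noise level sigma lies inside [low, high]; otherwise fall
# back to 1.0 (i.e. no guidance). Names are illustrative, not the real API.
def scheduled_cfg_scale(sigma: float, cfg_scale: float,
                        low: float, high: float) -> float:
    return cfg_scale if low <= sigma <= high else 1.0
```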

@@ -62,6 +130,8 @@ Prompt: "A pointillist painting of a raccoon looking at the sea."
Prompt: "An epic lithograph of a handsome salaryman carefully pouring coffee from a cup into an overflowing carafe, 4K, directed by Wong Kar Wai"
- SD XL
![image](./images/xyz_grid-3380-1-An%20epic%20lithograph%20of%20a%20handsome%20salaryman%20carefully%20pouring%20coffee%20from%20a%20cup%20into%20an%20overflowing%20carafe,%204K,%20directed%20by%20Wong.jpg)

#### [Return to top](#sd-webui-incantations)
---
## Multi-Concept T2I-Zero / Attention Regulation

@@ -98,6 +168,7 @@ SD XL
- https://multi-concept-t2i-zero.github.io/
- https://github.com/YaNgZhAnG-V5/attention_regulation

#### [Return to top](#sd-webui-incantations)
---
### Seek for Incantations
An incomplete implementation of a "prompt-upsampling" method from https://arxiv.org/abs/2401.06345
@@ -121,6 +192,7 @@ SD XL
* Modified Prompt: cinematic 4K photo of a dog riding a bus and eating cake and wearing headphones BREAK - - - - - dog - - bus - - - - - -
![image](./images/xyz_grid-2652-1419902843-cinematic%204K%20photo%20of%20a%20dog%20riding%20a%20bus%20and%20eating%20cake%20and%20wearing%20headphones.png)

#### [Return to top](#sd-webui-incantations)
---

### Issues / Pull Requests are welcome!
@@ -132,6 +204,8 @@ SD XL

[![image](https://cdn-uploads.huggingface.co/production/uploads/6345bd89fe134dfd7a0dba40/TzuZWTiHAc3wTxh3PwGL5.png)](https://youtu.be/lMQ7DIPmrfI)

#### [Return to top](#sd-webui-incantations)

## Also check out:

* **Characteristic Guidance**: Awesome enhancements for sampling at high CFG levels [https://github.com/scraed/CharacteristicGuidanceWebUI](https://github.com/scraed/CharacteristicGuidanceWebUI)
@@ -144,6 +218,7 @@ SD XL

* **Agent Attention**: Faster image generation and improved image quality with Agent Attention [https://github.com/v0xie/sd-webui-agentattention](https://github.com/v0xie/sd-webui-agentattention)

#### [Return to top](#sd-webui-incantations)
---

### Credits
@@ -203,9 +278,19 @@ SD XL
primaryClass={cs.CV}
}

@misc{shen2024rethinking,
title={Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance},
author={Dazhong Shen and Guanglu Song and Zeyue Xue and Fu-Yun Wang and Yu Liu},
year={2024},
eprint={2404.05384},
archivePrefix={arXiv},
primaryClass={cs.CV}
}


- [Hard Prompts Made Easy](https://github.com/YuxinWenRick/hard-prompts-made-easy)
- [@udon-universe's extension templates](https://github.com/udon-universe/stable-diffusion-webui-extension-templates)
#### [Return to top](#sd-webui-incantations)
---

Binary file added images/xyz_grid-0006-1-SCFG.jpg
237 changes: 237 additions & 0 deletions scripts/cfg_combiner.py
@@ -0,0 +1,237 @@
import gradio as gr
import logging
import torch
from modules import shared, scripts, devices, patches, script_callbacks
from modules.script_callbacks import CFGDenoiserParams
from modules.processing import StableDiffusionProcessing
from scripts.incantation_base import UIWrapper
from scripts.scfg import scfg_combine_denoised

logger = logging.getLogger(__name__)

class CFGCombinerScript(UIWrapper):
""" Some scripts modify the CFGs in ways that are not compatible with each other.
This script will patch the CFG denoiser function to apply CFG in an ordered way.
This script adds a dict named 'incant_cfg_params' to the processing object.
This dict contains the following:
'denoiser': the denoiser object
'pag_params': list of PAG parameters
'scfg_params': the S-CFG parameters
...
"""
def __init__(self):
pass

# Extension title in menu UI
def title(self):
return "CFG Combiner"

# Decide to show menu in txt2img or img2img
def show(self, is_img2img):
return scripts.AlwaysVisible

# Setup menu ui detail
def setup_ui(self, is_img2img):
self.infotext_fields = []
self.paste_field_names = []
return []

def before_process(self, p: StableDiffusionProcessing, *args, **kwargs):
logger.debug("CFGCombinerScript before_process")
cfg_dict = {
"denoiser": None,
"pag_params": None,
"scfg_params": None
}
setattr(p, 'incant_cfg_params', cfg_dict)

def process(self, p: StableDiffusionProcessing, *args, **kwargs):
pass

def before_process_batch(self, p: StableDiffusionProcessing, *args, **kwargs):
pass

def process_batch(self, p: StableDiffusionProcessing, *args, **kwargs):
""" Process the batch and hook the CFG denoiser if PAG or S-CFG is active """
logger.debug("CFGCombinerScript process_batch")
pag_active = p.extra_generation_params.get('PAG Active', False)
cfg_active = p.extra_generation_params.get('CFG Interval Enable', False)
scfg_active = p.extra_generation_params.get('SCFG Active', False)

if not any([
pag_active,
cfg_active,
scfg_active
]):
return

#logger.debug("CFGCombinerScript process_batch: pag_active or scfg_active")

cfg_denoise_lambda = lambda params: self.on_cfg_denoiser_callback(params, p.incant_cfg_params)
unhook_lambda = lambda: self.unhook_callbacks()

script_callbacks.on_cfg_denoiser(cfg_denoise_lambda)
script_callbacks.on_script_unloaded(unhook_lambda)
logger.debug('Hooked callbacks')

def postprocess_batch(self, p: StableDiffusionProcessing, *args, **kwargs):
logger.debug("CFGCombinerScript postprocess_batch")
script_callbacks.remove_current_script_callbacks()

def unhook_callbacks(self, cfg_dict = None):
if not cfg_dict:
return
self.unpatch_cfg_denoiser(cfg_dict)

def on_cfg_denoiser_callback(self, params: CFGDenoiserParams, cfg_dict: dict):
""" Callback for when the CFG denoiser is called
Patches the combine_denoised function with a custom one.
"""
if cfg_dict['denoiser'] is None:
cfg_dict['denoiser'] = params.denoiser
else:
self.unpatch_cfg_denoiser(cfg_dict)
self.patch_cfg_denoiser(params.denoiser, cfg_dict)

def patch_cfg_denoiser(self, denoiser, cfg_dict: dict):
""" Patch the CFG Denoiser combine_denoised function """
if not cfg_dict:
logger.error("Unable to patch CFG Denoiser, no dict passed as cfg_dict")
return
if not denoiser:
logger.error("Unable to patch CFG Denoiser, denoiser is None")
return

if getattr(denoiser, 'combine_denoised_patched', False) is False:
try:
setattr(denoiser, 'combine_denoised_original', denoiser.combine_denoised)
# create patch that references the original function
pass_conds_func = lambda *args, **kwargs: combine_denoised_pass_conds_list(
*args,
**kwargs,
original_func = denoiser.combine_denoised_original,
pag_params = cfg_dict['pag_params'],
scfg_params = cfg_dict['scfg_params']
)
patched_combine_denoised = patches.patch(__name__, denoiser, "combine_denoised", pass_conds_func)
setattr(denoiser, 'combine_denoised_patched', True)
setattr(denoiser, 'combine_denoised_original', patches.original(__name__, denoiser, "combine_denoised"))
except KeyError:
logger.exception("KeyError patching combine_denoised")
pass
except RuntimeError:
logger.exception("RuntimeError patching combine_denoised")
pass

def unpatch_cfg_denoiser(self, cfg_dict = None):
""" Unpatch the CFG Denoiser combine_denoised function """
if cfg_dict is None:
return
denoiser = cfg_dict.get('denoiser', None)
if denoiser is None:
return

setattr(denoiser, 'combine_denoised_patched', False)
try:
patches.undo(__name__, denoiser, "combine_denoised")
except KeyError:
logger.exception("KeyError unhooking combine_denoised")
pass
except RuntimeError:
logger.exception("RuntimeError unhooking combine_denoised")
pass

cfg_dict['denoiser'] = None


def combine_denoised_pass_conds_list(*args, **kwargs):
""" Hijacked function for combine_denoised in CFGDenoiser
Currently relies on the original function not having any kwargs
If any of the params are not None, it will apply the corresponding guidance
The order of guidance is:
1. CFG and S-CFG are combined multiplicatively
2. PAG guidance is added to the result
3. ...
...
"""
original_func = kwargs.get('original_func', None)
pag_params = kwargs.get('pag_params', None)
scfg_params = kwargs.get('scfg_params', None)

if pag_params is None and scfg_params is None:
logger.warning("No reason to hijack combine_denoised")
return original_func(*args)

def new_combine_denoised(x_out, conds_list, uncond, cond_scale):
denoised_uncond = x_out[-uncond.shape[0]:]
denoised = torch.clone(denoised_uncond)

### Variables
# 0. Standard CFG Value
cfg_scale = cond_scale

# 1. CFG Interval
# Overrides cfg_scale if pag_params is not None
if pag_params is not None:
if pag_params.cfg_interval_enable:
cfg_scale = pag_params.cfg_interval_scheduled_value

# 2. PAG
pag_x_out = None
pag_scale = None
if pag_params is not None:
pag_active = pag_params.pag_active
pag_x_out = pag_params.pag_x_out
pag_scale = pag_params.pag_scale

### Combine Denoised
for i, conds in enumerate(conds_list):
for cond_index, weight in conds:

model_delta = x_out[cond_index] - denoised_uncond[i]

# S-CFG
rate = 1.0
if scfg_params is not None:
rate = scfg_combine_denoised(
model_delta = model_delta,
cfg_scale = cfg_scale,
scfg_params = scfg_params,
)
                # Normalize rate: None -> 1.0, tensor -> move to device/dtype,
                # plain int/float scalar -> leave as-is
                if rate is None:
                    logger.error("scfg_combine_denoised returned None, using default rate of 1.0")
                    rate = 1.0
                elif not isinstance(rate, (int, float)):
                    # rate is a tensor: move it to the working device and dtype
                    rate = rate.to(device=shared.device, dtype=model_delta.dtype)
                else:
                    # rate is already a plain scalar
                    pass

# 1. Experimental formulation for S-CFG combined with CFG
denoised[i] += (model_delta) * rate * (weight * cfg_scale)
del rate

# 2. PAG
# PAG is added like CFG
if pag_params is not None:
if not pag_active:
pass
# Not within step interval?
elif not pag_params.pag_start_step <= pag_params.step <= pag_params.pag_end_step:
pass
# Scale is zero?
elif pag_scale <= 0:
pass
# do pag
else:
try:
denoised[i] += (x_out[cond_index] - pag_x_out[i]) * (weight * pag_scale)
except Exception as e:
logger.exception("Exception in combine_denoised_pass_conds_list - %s", e)

#torch.cuda.empty_cache()
devices.torch_gc()

return denoised
return new_combine_denoised(*args)