Dev2 #1166

Merged: 26 commits, Jul 11, 2023
Commits (26)
66c03be
Fix TE key names for SD1/2 LoRA are invalid
kohya-ss Jul 8, 2023
d599394
support avif
ddPn08 Jul 8, 2023
fe7ede5
fix wrapper tokenizer not work for weighted prompt
kohya-ss Jul 9, 2023
1d25703
add generation script
kohya-ss Jul 9, 2023
8371a7a
update readme
kohya-ss Jul 9, 2023
5f34857
Update sdxl_train.py
KohakuBlueleaf Jul 9, 2023
d974959
Update train_util.py for full_bf16 support
KohakuBlueleaf Jul 9, 2023
7502f66
Merge branch 'sdxl' of https://github.com/kohya-ss/sd-scripts into sdxl
kohya-ss Jul 9, 2023
256ff5b
Merge pull request #626 from ddPn08/sdxl
kohya-ss Jul 9, 2023
3579b45
Merge pull request #628 from KohakuBlueleaf/full_bf16
kohya-ss Jul 9, 2023
0416f26
support multi gpu in caching text encoder outputs
kohya-ss Jul 9, 2023
a380502
fix pad token is not handled
kohya-ss Jul 9, 2023
77ec70d
fix conditioning
kohya-ss Jul 9, 2023
c2ceb6d
fix uncond/cond order
kohya-ss Jul 9, 2023
5c80117
update readme
kohya-ss Jul 9, 2023
b6e328e
don't hold latent on memory for finetuning dataset
kohya-ss Jul 9, 2023
b762ed2
Merge branch 'sdxl' of https://github.com/kohya-ss/sd-scripts into dev2
bmaltais Jul 10, 2023
f54b784
support textual inversion training
kohya-ss Jul 10, 2023
68ca0ea
Fix to show template type
kohya-ss Jul 10, 2023
1ba606c
Merge branch 'sdxl' of https://github.com/kohya-ss/sd-scripts into dev2
bmaltais Jul 10, 2023
2e67d74
add no_half_vae option
kohya-ss Jul 11, 2023
814996b
fix NaN in sampling image
kohya-ss Jul 11, 2023
b114e1f
Merge branch 'sdxl' of https://github.com/kohya-ss/sd-scripts into dev2
bmaltais Jul 11, 2023
689721c
Updates
bmaltais Jul 11, 2023
15c33d9
Update torch choice for windows
bmaltais Jul 11, 2023
b602aa2
Update version
bmaltais Jul 11, 2023
2 changes: 1 addition & 1 deletion .release
@@ -1 +1 @@
-v21.8.1
+v21.8.2
49 changes: 44 additions & 5 deletions README.md
@@ -49,15 +49,32 @@ The feature of SDXL training is now available in sdxl branch as an experimental
Summary of the feature:

- `sdxl_train.py` is a script for SDXL fine-tuning. The usage is almost the same as `fine_tune.py`, but it also supports the DreamBooth dataset format.
  - `--full_bf16` option is added. Thanks to KohakuBlueleaf!
    - This option enables full bfloat16 training (gradients included) and is useful for reducing GPU memory usage.
    - However, bitsandbytes==0.35 doesn't seem to support this. Please use a newer version of bitsandbytes or another optimizer.
    - I couldn't find a bitsandbytes>0.35.0 that works correctly on Windows.
    - In addition, full bfloat16 training might be unstable. Please use it at your own risk.
- `prepare_buckets_latents.py` now supports SDXL fine-tuning.
- `sdxl_train_network.py` is a script for LoRA training for SDXL. The usage is almost the same as `train_network.py`.
- Both scripts have the following additional options:
  - `--cache_text_encoder_outputs`: Cache the outputs of the text encoders. This option is useful for reducing GPU memory usage. It cannot be used together with the options for shuffling or dropping captions.
  - `--no_half_vae`: Disable the half-precision (mixed-precision) VAE. The VAE for SDXL seems to produce NaNs in some cases, and this option helps avoid them.
- Image generation during training is now available. However, the VAE for SDXL seems to produce NaNs in some cases when using `fp16`, and the resulting images will be black. Currently the NaNs cannot be avoided even with the `--no_half_vae` option; image generation works with `bf16` or without mixed precision.

- The `--weighted_captions` option is not supported yet for either script.
- `--min_timestep` and `--max_timestep` options are added to each training script. These options can be used to train the U-Net with a restricted range of timesteps. The default values are 0 and 1000. A minimal sketch of sampling from such a range is shown after this list.

- `sdxl_train_textual_inversion.py` is a script for Textual Inversion training for SDXL. The usage is almost the same as `train_textual_inversion.py`.
  - `--cache_text_encoder_outputs` is not supported.
  - `token_string` must currently contain only alphabetic characters, due to a limitation of the open-clip tokenizer.
  - There are two options for captions:
    1. Training with captions. All captions must include the token string. The token string is replaced with multiple tokens.
    2. Use the `--use_object_template` or `--use_style_template` option. The captions are generated from the template, and any existing captions are ignored.
  - See below for the format of the embeddings.

- `sdxl_gen_img.py` is added. This script can be used to generate images with SDXL, including with LoRA. See the help message for the usage.
  - Textual Inversion is supported, but the embedding name used in the caption becomes alphabetic characters only. For example, `neg_hand_v1.safetensors` can be activated with `neghandv`.
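
As a rough illustration of the `--min_timestep`/`--max_timestep` options mentioned above, a training step could sample the diffusion timestep uniformly from the restricted range. This is only a minimal sketch, not the scripts' actual code path:

```python
import torch

def sample_timesteps(batch_size: int, min_timestep: int = 0, max_timestep: int = 1000) -> torch.Tensor:
    # Uniformly sample integer timesteps in [min_timestep, max_timestep).
    return torch.randint(min_timestep, max_timestep, (batch_size,), dtype=torch.long)

# Example: restrict training to the noisier half of the schedule.
timesteps = sample_timesteps(4, min_timestep=500, max_timestep=1000)
```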

`requirements.txt` is updated to support SDXL training.

#### Tips for SDXL training
@@ -71,16 +88,34 @@ Summary of the feature:
- The LoRA training can be done with 12GB GPU memory.
- The `--network_train_unet_only` option is highly recommended for SDXL LoRA; because SDXL has two text encoders, training them can give unexpected results.
- PyTorch 2 seems to use slightly less GPU memory than PyTorch 1.
- `--bucket_reso_steps` can be set to 32 instead of the default value 64. Values smaller than 32 will not work for SDXL training.

Example of the optimizer settings for Adafactor with a fixed learning rate:
```toml
optimizer_type = "adafactor"
optimizer_args = [ "scale_parameter=False", "relative_step=False", "warmup_init=False" ]
lr_scheduler = "constant_with_warmup"
lr_warmup_steps = 100
learning_rate = 4e-7 # SDXL original learning rate
```
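
For reference, these settings correspond roughly to constructing the optimizer and scheduler as below. This is a sketch assuming the `transformers` package is installed; it is not the exact code used by the training scripts:

```python
import torch
from transformers.optimization import Adafactor, get_constant_schedule_with_warmup

params = [torch.nn.Parameter(torch.zeros(8))]  # placeholder for the model parameters

optimizer = Adafactor(
    params,
    lr=4e-7,                # learning_rate (SDXL original learning rate)
    scale_parameter=False,
    relative_step=False,
    warmup_init=False,
)
# constant_with_warmup scheduler with lr_warmup_steps = 100
lr_scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps=100)
```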

### Format of Textual Inversion embeddings

```python
from safetensors.torch import save_file

# "clip_g": embeddings for the larger SDXL text encoder (width 1280)
# "clip_l": embeddings for the smaller SDXL text encoder (width 768)
state_dict = {"clip_g": embs_for_text_encoder_1280, "clip_l": embs_for_text_encoder_768}
save_file(state_dict, file)
```
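
A saved embedding can be sanity-checked with a few lines; the file name below is only an example:

```python
from safetensors.torch import load_file

emb = load_file("my_embedding.safetensors")  # hypothetical file name
assert emb["clip_g"].shape[-1] == 1280  # width of the larger SDXL text encoder
assert emb["clip_l"].shape[-1] == 768   # width of the smaller SDXL text encoder
assert emb["clip_g"].shape[0] == emb["clip_l"].shape[0]  # same number of token vectors
```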

### TODO

- [ ] Support conversion of Diffusers SDXL models.
- [ ] Support `--weighted_captions` option.
- [ ] Change `--output_config` option to continue the training.
- [ ] Extend `--full_bf16` for all the scripts.
- [x] Support Textual Inversion training.

## About requirements.txt

[![LoRA Part 2 Tutorial](https://img.youtube.com/vi/k5imq01uvUY/0.jpg)](https://www.youtube.com/watch?v=k5imq01uvUY)
@@ -425,6 +460,10 @@ If you come across a `FileNotFoundError`, it is likely due to an installation is

## Change History

* 2023/07/10 (v21.8.1)
* 2023/07/11 (v21.8.2)
  - Let Tensorboard work in docker #1137
  - Fix for accelerate issue
  - Add SDXL TI training support
  - Rework GUI for common layout
  - More LoRA tools to class
  - Add no_half_vae option to TI