Error when training Dreambooth LoRA #1087

Starbaby8 · 2023-06-27T16:22:06Z

OS: Windows 11
GPU: RTX 4090

Training 400 images

Attempting to train results in an error:

12:10:56-411744 INFO     Start training LoRA Standard ...
12:10:56-412743 INFO     Valid image folder names found in: C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img
12:10:56-414744 INFO     Folder 15_taoillust Illustrative Style: 394 images found
12:10:56-415744 INFO     Folder 15_taoillust Illustrative Style: 5910 steps
12:10:56-415744 INFO     Total steps: 5910
12:10:56-416744 INFO     Train batch size: 1
12:10:56-416744 INFO     Gradient accumulation steps: 1
12:10:56-417744 INFO     Epoch: 10
12:10:56-417744 INFO     Regulatization factor: 1
12:10:56-418743 INFO     max_train_steps (5910 / 1 / 1 * 10 * 1) = 59100
12:10:56-418743 INFO     stop_text_encoder_training = 0
12:10:56-419793 INFO     lr_warmup_steps = 5910
12:10:56-420793 INFO     accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket
                         --pretrained_model_name_or_path="C:/Users/nicol/Documents/Ai/stable-diffusion-webui/models/Stab
                         le-diffusion/v1-5-pruned.safetensors"
                         --train_data_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img"
                         --resolution="768x768"
                         --output_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/model"
                         --logging_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/log" --network_alpha="200"
                         --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-05
                         --unet_lr=0.0001 --network_dim=200 --output_name="last" --lr_scheduler_num_cycles="10"
                         --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="5910"
                         --train_batch_size="1" --max_train_steps="59100" --save_every_n_epochs="1"
                         --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit"
                         --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\train_network.py:873 in <module>                      │
│                                                                                                  │
│   870 │   args = parser.parse_args()                                                             │
│   871 │   args = train_util.read_config_from_file(args, parser)                                  │
│   872 │                                                                                          │
│ ❱ 873 │   train(args)                                                                            │
│   874                                                                                            │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\train_network.py:82 in train                          │
│                                                                                                  │
│    79 │   session_id = random.randint(0, 2**32)                                                  │
│    80 │   training_started_at = time.time()                                                      │
│    81 │   train_util.verify_training_args(args)                                                  │
│ ❱  82 │   train_util.prepare_dataset_args(args, True)                                            │
│    83 │                                                                                          │
│    84 │   cache_latents = args.cache_latents                                                     │
│    85 │   use_dreambooth_method = args.in_json is None                                           │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\library\train_util.py:2991 in prepare_dataset_args    │
│                                                                                                  │
│   2988 │                                                                                         │
│   2989 │   # assert args.resolution is not None, f"resolution is required / resolution（解像度   │
│   2990 │   if args.resolution is not None:                                                       │
│ ❱ 2991 │   │   args.resolution = tuple([int(r) for r in args.resolution.split(",")])             │
│   2992 │   │   if len(args.resolution) == 1:                                                     │
│   2993 │   │   │   args.resolution = (args.resolution[0], args.resolution[0])                    │
│   2994 │   │   assert (                                                                          │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\library\train_util.py:2991 in <listcomp>              │
│                                                                                                  │
│   2988 │                                                                                         │
│   2989 │   # assert args.resolution is not None, f"resolution is required / resolution（解像度   │
│   2990 │   if args.resolution is not None:                                                       │
│ ❱ 2991 │   │   args.resolution = tuple([int(r) for r in args.resolution.split(",")])             │
│   2992 │   │   if len(args.resolution) == 1:                                                     │
│   2993 │   │   │   args.resolution = (args.resolution[0], args.resolution[0])                    │
│   2994 │   │   assert (                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: invalid literal for int() with base 10: '768x768'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\nicol\AppData\Local\Programs\Python\Python310\lib\runpy.py:196 in _run_module_as_main   │
│                                                                                                  │
│   193 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   194 │   if alter_argv:                                                                         │
│   195 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 196 │   return _run_code(code, main_globals, None,                                             │
│   197 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   198                                                                                            │
│   199 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ C:\Users\nicol\AppData\Local\Programs\Python\Python310\lib\runpy.py:86 in _run_code              │
│                                                                                                  │
│    83 │   │   │   │   │      __loader__ = loader,                                                │
│    84 │   │   │   │   │      __package__ = pkg_name,                                             │
│    85 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  86 │   exec(code, run_globals)                                                                │
│    87 │   return run_globals                                                                     │
│    88                                                                                            │
│    89 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ in <module>:7                                                                                    │
│                                                                                                  │
│   4 from accelerate.commands.accelerate_cli import main                                          │
│   5 if __name__ == '__main__':                                                                   │
│   6 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 7 │   sys.exit(main())                                                                         │
│   8                                                                                              │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate │
│ _cli.py:45 in main                                                                               │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py: │
│ 918 in launch_command                                                                            │
│                                                                                                  │
│   915 │   elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA   │
│   916 │   │   sagemaker_launcher(defaults, args)                                                 │
│   917 │   else:                                                                                  │
│ ❱ 918 │   │   simple_launcher(args)                                                              │
│   919                                                                                            │
│   920                                                                                            │
│   921 def main():                                                                                │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py: │
│ 580 in simple_launcher                                                                           │
│                                                                                                  │
│   577 │   process.wait()                                                                         │
│   578 │   if process.returncode != 0:                                                            │
│   579 │   │   if not args.quiet:                                                                 │
│ ❱ 580 │   │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)    │
│   581 │   │   else:                                                                              │
│   582 │   │   │   sys.exit(1)                                                                    │
│   583                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['C:\\Users\\nicol\\Documents\\Ai\\Khoya\\kohya_ss\\venv\\Scripts\\python.exe',
'train_network.py', '--enable_bucket',
'--pretrained_model_name_or_path=C:/Users/nicol/Documents/Ai/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned.
safetensors', '--train_data_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img', '--resolution=768x768',
'--output_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/model',
'--logging_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/log', '--network_alpha=200',
'--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001',
'--network_dim=200', '--output_name=last', '--lr_scheduler_num_cycles=10', '--learning_rate=0.0001',
'--lr_scheduler=cosine', '--lr_warmup_steps=5910', '--train_batch_size=1', '--max_train_steps=59100',
'--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents',
'--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers',
'--bucket_no_upscale']' returned non-zero exit status 1.
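
For reference, the step counts in the log above can be reproduced by hand. This is only a sketch: it assumes the numeric prefix of the folder name (`15_`) is the per-image repeat count, which matches the logged 394 images and 5910 steps. The helper name `repeats_from_folder` is made up for illustration and is not part of kohya_ss.

```python
import re


def repeats_from_folder(name: str) -> int:
    """Read the leading '<number>_' prefix of a training folder name.

    Assumption: the prefix is the repeat count per image.
    """
    match = re.match(r"(\d+)_", name)
    if match is None:
        raise ValueError(f"no '<repeats>_' prefix in folder name: {name!r}")
    return int(match.group(1))


# Values copied from the log above.
images = 394
repeats = repeats_from_folder("15_taoillust Illustrative Style")
batch_size, grad_accum, epochs, reg_factor = 1, 1, 10, 1

steps_per_epoch = images * repeats
# Same expression as the logged "(5910 / 1 / 1 * 10 * 1)"; integer
# division is a simplification that is exact for these values.
max_train_steps = steps_per_epoch // batch_size // grad_accum * epochs * reg_factor

print(steps_per_epoch)   # 5910
print(max_train_steps)   # 59100
```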

I have the folder directories set up:

The img folder contains:

which has all the training images, captioned with BLIP as well.

Here are my training parameters:

Any help would be greatly appreciated.

bmaltais · 2023-06-27T21:32:33Z

Use 768,768 for the max resolution, not 768x768.

I know, it's weird, but it needs a comma instead of an x.
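
To illustrate why the comma matters, here is a minimal sketch of the parsing shown at `train_util.py:2991` in the traceback. The wrapper function `parse_resolution` is a hypothetical name used only for this example; the split-on-comma and `int()` logic is copied from the traceback lines.

```python
def parse_resolution(value: str) -> tuple[int, int]:
    """Mimic the resolution parsing visible in the traceback."""
    # The string is split on "," only, so "768x768" stays one piece.
    parts = tuple(int(r) for r in value.split(","))
    if len(parts) == 1:
        # A single number is used for both dimensions.
        parts = (parts[0], parts[0])
    return parts


print(parse_resolution("768,768"))  # (768, 768)
print(parse_resolution("768"))      # (768, 768)

try:
    parse_resolution("768x768")
except ValueError as err:
    # int("768x768") fails, which is the error reported in this issue.
    print(err)  # invalid literal for int() with base 10: '768x768'
```

Since the value is never split on `x`, `int()` receives the whole string `768x768` and raises the `ValueError` seen above.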

dfghsderftgerdf · 2023-07-04T07:33:16Z

Try this; it helped me when I had the same issue. Download the latest CUDA Toolkit:

https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_local

bmaltais closed this as completed Jan 29, 2024

bmaltais pushed a commit that referenced this issue Feb 4, 2024

Merge pull request #1087 from mgz-dev/fix-imports-on-svd_merge_lora

7f948db

fix broken import in svd_merge_lora script
