Error when training Dreambooth LoRA #1087

Closed
Starbaby8 opened this issue Jun 27, 2023 · 2 comments

Comments

@Starbaby8

OS: Windows 11
GPU: RTX 4090

Training 400 images

Attempting to train results in an error:

12:10:56-411744 INFO     Start training LoRA Standard ...
12:10:56-412743 INFO     Valid image folder names found in: C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img
12:10:56-414744 INFO     Folder 15_taoillust Illustrative Style: 394 images found
12:10:56-415744 INFO     Folder 15_taoillust Illustrative Style: 5910 steps
12:10:56-415744 INFO     Total steps: 5910
12:10:56-416744 INFO     Train batch size: 1
12:10:56-416744 INFO     Gradient accumulation steps: 1
12:10:56-417744 INFO     Epoch: 10
12:10:56-417744 INFO     Regulatization factor: 1
12:10:56-418743 INFO     max_train_steps (5910 / 1 / 1 * 10 * 1) = 59100
12:10:56-418743 INFO     stop_text_encoder_training = 0
12:10:56-419793 INFO     lr_warmup_steps = 5910
12:10:56-420793 INFO     accelerate launch --num_cpu_threads_per_process=2 "train_network.py" --enable_bucket
                         --pretrained_model_name_or_path="C:/Users/nicol/Documents/Ai/stable-diffusion-webui/models/Stab
                         le-diffusion/v1-5-pruned.safetensors"
                         --train_data_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img"
                         --resolution="768x768"
                         --output_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/model"
                         --logging_dir="C:/Users/nicol/Documents/Taocah_Training/Lora/Images/log" --network_alpha="200"
                         --save_model_as=safetensors --network_module=networks.lora --text_encoder_lr=5e-05
                         --unet_lr=0.0001 --network_dim=200 --output_name="last" --lr_scheduler_num_cycles="10"
                         --learning_rate="0.0001" --lr_scheduler="cosine" --lr_warmup_steps="5910"
                         --train_batch_size="1" --max_train_steps="59100" --save_every_n_epochs="1"
                         --mixed_precision="fp16" --save_precision="fp16" --cache_latents --optimizer_type="AdamW8bit"
                         --max_data_loader_n_workers="0" --bucket_reso_steps=64 --xformers --bucket_no_upscale
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\train_network.py:873 in <module>                      │
│                                                                                                  │
│   870 │   args = parser.parse_args()                                                             │
│   871 │   args = train_util.read_config_from_file(args, parser)                                  │
│   872 │                                                                                          │
│ ❱ 873 │   train(args)                                                                            │
│   874                                                                                            │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\train_network.py:82 in train                          │
│                                                                                                  │
│    79 │   session_id = random.randint(0, 2**32)                                                  │
│    80 │   training_started_at = time.time()                                                      │
│    81 │   train_util.verify_training_args(args)                                                  │
│ ❱  82 │   train_util.prepare_dataset_args(args, True)                                            │
│    83 │                                                                                          │
│    84 │   cache_latents = args.cache_latents                                                     │
│    85 │   use_dreambooth_method = args.in_json is None                                           │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\library\train_util.py:2991 in prepare_dataset_args    │
│                                                                                                  │
│   2988 │                                                                                         │
│   2989 │   # assert args.resolution is not None, f"resolution is required / resolution(解像度   │
│   2990 │   if args.resolution is not None:                                                       │
│ ❱ 2991 │   │   args.resolution = tuple([int(r) for r in args.resolution.split(",")])             │
│   2992 │   │   if len(args.resolution) == 1:                                                     │
│   2993 │   │   │   args.resolution = (args.resolution[0], args.resolution[0])                    │
│   2994 │   │   assert (                                                                          │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\library\train_util.py:2991 in <listcomp>              │
│                                                                                                  │
│   2988 │                                                                                         │
│   2989 │   # assert args.resolution is not None, f"resolution is required / resolution(解像度   │
│   2990 │   if args.resolution is not None:                                                       │
│ ❱ 2991 │   │   args.resolution = tuple([int(r) for r in args.resolution.split(",")])             │
│   2992 │   │   if len(args.resolution) == 1:                                                     │
│   2993 │   │   │   args.resolution = (args.resolution[0], args.resolution[0])                    │
│   2994 │   │   assert (                                                                          │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: invalid literal for int() with base 10: '768x768'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ C:\Users\nicol\AppData\Local\Programs\Python\Python310\lib\runpy.py:196 in _run_module_as_main   │
│                                                                                                  │
│   193 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   194 │   if alter_argv:                                                                         │
│   195 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 196 │   return _run_code(code, main_globals, None,                                             │
│   197 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   198                                                                                            │
│   199 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ C:\Users\nicol\AppData\Local\Programs\Python\Python310\lib\runpy.py:86 in _run_code              │
│                                                                                                  │
│    83 │   │   │   │   │      __loader__ = loader,                                                │
│    84 │   │   │   │   │      __package__ = pkg_name,                                             │
│    85 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  86 │   exec(code, run_globals)                                                                │
│    87 │   return run_globals                                                                     │
│    88                                                                                            │
│    89 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ in <module>:7                                                                                    │
│                                                                                                  │
│   4 from accelerate.commands.accelerate_cli import main                                          │
│   5 if __name__ == '__main__':                                                                   │
│   6 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 7 │   sys.exit(main())                                                                         │
│   8                                                                                              │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate │
│ _cli.py:45 in main                                                                               │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py: │
│ 918 in launch_command                                                                            │
│                                                                                                  │
│   915 │   elif defaults is not None and defaults.compute_environment == ComputeEnvironment.AMA   │
│   916 │   │   sagemaker_launcher(defaults, args)                                                 │
│   917 │   else:                                                                                  │
│ ❱ 918 │   │   simple_launcher(args)                                                              │
│   919                                                                                            │
│   920                                                                                            │
│   921 def main():                                                                                │
│                                                                                                  │
│ C:\Users\nicol\Documents\Ai\Khoya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py: │
│ 580 in simple_launcher                                                                           │
│                                                                                                  │
│   577 │   process.wait()                                                                         │
│   578 │   if process.returncode != 0:                                                            │
│   579 │   │   if not args.quiet:                                                                 │
│ ❱ 580 │   │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)    │
│   581 │   │   else:                                                                              │
│   582 │   │   │   sys.exit(1)                                                                    │
│   583                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['C:\\Users\\nicol\\Documents\\Ai\\Khoya\\kohya_ss\\venv\\Scripts\\python.exe',
'train_network.py', '--enable_bucket',
'--pretrained_model_name_or_path=C:/Users/nicol/Documents/Ai/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned.
safetensors', '--train_data_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/img', '--resolution=768x768',
'--output_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/model',
'--logging_dir=C:/Users/nicol/Documents/Taocah_Training/Lora/Images/log', '--network_alpha=200',
'--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001',
'--network_dim=200', '--output_name=last', '--lr_scheduler_num_cycles=10', '--learning_rate=0.0001',
'--lr_scheduler=cosine', '--lr_warmup_steps=5910', '--train_batch_size=1', '--max_train_steps=59100',
'--save_every_n_epochs=1', '--mixed_precision=fp16', '--save_precision=fp16', '--cache_latents',
'--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--xformers',
'--bucket_no_upscale']' returned non-zero exit status 1.
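
For reference, the step counts in the log above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the GUI's actual code: the function name `estimate_max_train_steps` is hypothetical, the repeat count of 15 is inferred from the folder prefix `15_` (394 images giving 5910 steps), and the use of `math.ceil` is an assumption.

```python
import math


def estimate_max_train_steps(images, repeats, batch_size, grad_accum, epochs, reg_factor):
    """Hypothetical helper mirroring the formula printed in the log:
    max_train_steps (5910 / 1 / 1 * 10 * 1) = 59100
    """
    # Steps for one pass over the folder: every image is repeated `repeats` times.
    steps_per_epoch = images * repeats
    # Same order of operations as the log line; rounding up is an assumption.
    max_steps = math.ceil(steps_per_epoch / batch_size / grad_accum * epochs * reg_factor)
    return steps_per_epoch, max_steps


steps_per_epoch, max_steps = estimate_max_train_steps(
    images=394, repeats=15, batch_size=1, grad_accum=1, epochs=10, reg_factor=1
)
print(steps_per_epoch, max_steps)  # 5910 59100
```

The logged `lr_warmup_steps = 5910` is 10% of the 59100 total steps.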

I have the folder directories set up:

image

image

The img folder contains
image
which has all the images for training, captioned by BLIP as well.

Here are my training parameters:

image

image

Any help would be greatly appreciated

@bmaltais
Owner

bmaltais commented Jun 27, 2023

Use 768,768 for the max resolution, not 768x768.

I know, it's weird, but it needs a comma instead of an x.
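
The traceback shows why: `train_util.py` splits `--resolution` on `","` and passes each piece to `int()`, so `"768x768"` reaches `int()` unsplit and raises the `ValueError`. The sketch below reproduces that expression in a standalone function and, purely for illustration, adds a more forgiving variant. Both function names are hypothetical and the lenient parser is not part of kohya_ss.

```python
import re


def parse_resolution_strict(value):
    # Same expression as line 2991 of train_util.py in the traceback.
    res = tuple([int(r) for r in value.split(",")])
    if len(res) == 1:
        # A single number means a square resolution.
        res = (res[0], res[0])
    return res


def parse_resolution_lenient(value):
    # Hypothetical alternative: accept "768,768", "768x768", "768X768" or "768".
    parts = [p for p in re.split(r"[,xX]", value.strip()) if p.strip()]
    res = tuple(int(p) for p in parts)
    if len(res) == 1:
        res = (res[0], res[0])
    if len(res) != 2:
        raise ValueError(f"expected WIDTH,HEIGHT or a single size, got {value!r}")
    return res


print(parse_resolution_strict("768,768"))   # (768, 768)
print(parse_resolution_lenient("768x768"))  # (768, 768)

try:
    parse_resolution_strict("768x768")
except ValueError as err:
    # invalid literal for int() with base 10: '768x768'
    print(err)
```

Until the script accepts both separators, entering `768,768` in the max resolution field is the fix.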

@dfghsderftgerdf

Try this, it helped me when I had the same issue: download the latest CUDA Toolkit:

https://developer.nvidia.com/cuda-downloads?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exe_local

bmaltais pushed a commit that referenced this issue Feb 4, 2024
fix broken import in svd_merge_lora script