accelerate wrong #1084

Remosy · 2023-06-27T08:59:40Z

The system cannot find the path specified.
16:51:55-883076 ERROR Not a git repository
16:51:55-908027 INFO nVidia toolkit detected
16:51:57-295309 INFO Torch 2.0.1+cu118
16:51:57-319374 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
16:51:57-321346 INFO Torch detected GPU: NVIDIA GeForce RTX 2080 Ti VRAM 11264 Arch (7, 5) Cores 68
16:51:57-323338 INFO Verifying modules instalation status from requirements_windows_torch2.txt...
16:51:57-325333 INFO Verifying modules instalation status from requirements.txt...
16:51:59-921632 INFO headless: False
16:51:59-931603 INFO Load CSS...
Running on local URL: http://127.0.0.1:7860

I am using the latest code as of today, v21.7.16.

I ran gui.bat --listen 127.0.0.1 --server_port 7860 --inbrowser inside my conda environment. My accelerate version is 19 and my Python version is 3.10.

I got the following errors:

WARNING The following values were not passed to accelerate launch and had defaults used instead: launch.py:890
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
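
Note on the OMP message above: it names an environment variable, KMP_DUPLICATE_LIB_OK, as an unsafe workaround. The sketch below only shows how that variable could be set for a child process from Python. The helper name allow_duplicate_openmp is hypothetical, and the log does not show which libraries bring in the second OpenMP runtime, so this is a way to test whether the OMP error is what stops training, not a confirmed fix.

```python
import os


def allow_duplicate_openmp(env=None):
    """Return a copy of `env` (default: the current environment) with
    KMP_DUPLICATE_LIB_OK=TRUE set.

    The OMP message itself calls this workaround unsafe and unsupported,
    so treat it as a diagnostic step only.
    """
    new_env = dict(os.environ if env is None else env)
    new_env["KMP_DUPLICATE_LIB_OK"] = "TRUE"
    return new_env


# Hypothetical usage: pass the modified environment to the training process.
# import subprocess
# subprocess.run(["python", "train_network.py"], env=allow_duplicate_openmp())
```

The variable has to be in the environment before the process loads the OpenMP runtime, which is why the sketch builds the environment for a subprocess instead of setting it after libraries are imported.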

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ F:\anaconda3\lib\runpy.py:196 in _run_module_as_main                                             │
│                                                                                                  │
│   193 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   194 │   if alter_argv:                                                                         │
│   195 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 196 │   return _run_code(code, main_globals, None,                                             │
│   197 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   198                                                                                            │
│   199 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ F:\anaconda3\lib\runpy.py:86 in _run_code                                                        │
│                                                                                                  │
│    83 │   │   │   │   │      __loader__ = loader,                                                │
│    84 │   │   │   │   │      __package__ = pkg_name,                                             │
│    85 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  86 │   exec(code, run_globals)                                                                │
│    87 │   return run_globals                                                                     │
│    88                                                                                            │
│    89 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ in <module>:7                                                                                    │
│                                                                                                  │
│   4 from accelerate.commands.accelerate_cli import main                                          │
│   5 if __name__ == '__main__':                                                                   │
│   6 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 7 │   sys.exit(main())                                                                         │
│   8                                                                                              │
│                                                                                                  │
│ F:\anaconda3\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main                  │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│   577 │   process.wait()                                                                         │
│   578 │   if process.returncode != 0:                                                            │
│   579 │   │   if not args.quiet:                                                                 │
│ ❱ 580 │   │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)    │
│   581 │   │   else:                                                                              │
│   582 │   │   │   sys.exit(1)                                                                    │
│   583                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['F:\\anaconda3\\python.exe', 'train_network.py', '--enable_bucket',
'--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=C:\\Users\\Remi\\Desktop\\dongxu', '--resolution=450,800',
'--output_dir=E:\\sd.webui\\webui\\embeddings', '--logging_dir=C:\\Users\\Remi\\Desktop\\dongxu\\logs', '--network_alpha=1', '--save_model_as=safetensors',      
'--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001', '--network_dim=8', '--output_name=dongxu_try1', '--lr_scheduler_num_cycles=1',  
'--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=6', '--train_batch_size=1', '--max_train_steps=56', '--save_every_n_epochs=1',
'--mixed_precision=fp16', '--save_precision=fp16', '--seed=313', '--caption_extension=.txt', '--cache_latents', '--cache_latents_to_disk',
'--optimizer_type=Lion', '--max_train_epochs=10', '--bucket_reso_steps=64', '--save_state', '--xformers', '--bucket_no_upscale', "--mixed_precision'fp16'"]'     
returned non-zero exit status 3.
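
One detail worth checking in the command above: its last element is "--mixed_precision'fp16'", which has no "=" between the flag name and the value, and --mixed_precision=fp16 already appears earlier in the list. The log does not establish whether this argument is what makes the process fail. The sketch below is only a way to spot such arguments in an argument list; the function names are hypothetical and the accepted flag shape (--name or --name=value) is an assumption.

```python
import re

# Assumed shape of a well-formed long option: --name or --name=value.
FLAG_RE = re.compile(r"--[A-Za-z0-9_]+(=.*)?")


def find_malformed_flags(argv):
    """Return arguments that start with '--' but do not match FLAG_RE."""
    return [a for a in argv if a.startswith("--") and not FLAG_RE.fullmatch(a)]


def find_repeated_flags(argv):
    """Return flag names (text before '=') that occur more than once."""
    seen, repeated = set(), []
    for arg in argv:
        if not arg.startswith("--") or not FLAG_RE.fullmatch(arg):
            continue
        name = arg.split("=", 1)[0]
        if name in seen and name not in repeated:
            repeated.append(name)
        seen.add(name)
    return repeated


# A few arguments copied from the command in the error message.
args = [
    "train_network.py",
    "--enable_bucket",
    "--resolution=450,800",
    "--mixed_precision=fp16",
    "--xformers",
    "--mixed_precision'fp16'",
]
print(find_malformed_flags(args))
print(find_repeated_flags(args))
```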

bmaltais · 2023-06-27T09:24:36Z

Try running accelerate config manually
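
The warning in the log also mentions a second option besides accelerate config: passing the four values to accelerate launch explicitly. A minimal sketch of building such a command line follows. The helper name build_launch_command is hypothetical; the flag names and default values are the ones printed in the warning, and the command is only constructed here, not executed.

```python
def build_launch_command(script, script_args=(), num_processes=1,
                         num_machines=1, mixed_precision="no",
                         dynamo_backend="no"):
    """Build an `accelerate launch` argument list that sets every value
    the warning reported as defaulted."""
    return [
        "accelerate", "launch",
        f"--num_processes={num_processes}",
        f"--num_machines={num_machines}",
        f"--mixed_precision={mixed_precision}",
        f"--dynamo_backend={dynamo_backend}",
        script,
        *script_args,
    ]


# Example: single GPU, fp16, matching the training arguments in the report.
cmd = build_launch_command(
    "train_network.py",
    ["--enable_bucket", "--mixed_precision=fp16"],
    mixed_precision="fp16",
)
print(" ".join(cmd))
```

This only silences the defaults warning; it does not address the OMP error or the non-zero exit status shown in the report.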

Remosy closed this as completed Sep 3, 2023

bmaltais pushed a commit that referenced this issue Jan 27, 2024

Merge pull request #1084 from fireicewolf/devel

930a391

Fix network multiplier cause crashed while use multi-GPUs
