accelerate wrong #1084

Closed
Remosy opened this issue Jun 27, 2023 · 1 comment

Remosy commented Jun 27, 2023

The system cannot find the path specified.
16:51:55-883076 ERROR Not a git repository
16:51:55-908027 INFO nVidia toolkit detected
16:51:57-295309 INFO Torch 2.0.1+cu118
16:51:57-319374 INFO Torch backend: nVidia CUDA 11.8 cuDNN 8700
16:51:57-321346 INFO Torch detected GPU: NVIDIA GeForce RTX 2080 Ti VRAM 11264 Arch (7, 5) Cores 68
16:51:57-323338 INFO Verifying modules instalation status from requirements_windows_torch2.txt...
16:51:57-325333 INFO Verifying modules instalation status from requirements.txt...
16:51:59-921632 INFO headless: False
16:51:59-931603 INFO Load CSS...
Running on local URL: http://127.0.0.1:7860

I am using today's latest code, V21.7.16.

I ran gui.bat --listen 127.0.0.1 --server_port 7860 --inbrowser in my conda environment. My accelerate version is 19 and my Python version is 3.10.

I got the following errors:

WARNING The following values were not passed to accelerate launch and had defaults used instead: launch.py:890
--num_processes was set to a value of 1
--num_machines was set to a value of 1
--mixed_precision was set to a value of 'no'
--dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
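
The OMP hint says that more than one copy of the OpenMP runtime was loaded into the process. One way to check this, sketched here in Python, is to search the environment for every copy of libiomp5md.dll. The helper name find_runtime_copies is hypothetical and not part of any tool mentioned in this issue; it only lists files and changes nothing.

```python
from pathlib import Path


def find_runtime_copies(root, dll_name="libiomp5md.dll"):
    """Return the sorted paths under `root` whose file name equals `dll_name`.

    The comparison ignores case because Windows file names are case-insensitive.
    `find_runtime_copies` is a hypothetical helper written for this sketch.
    """
    target = dll_name.lower()
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.lower() == target
    )
```

Calling find_runtime_copies(r"F:\anaconda3") (the root seen in the traceback paths of this report) and printing the result would show where the copies live. More than one path would be consistent with the OMP error, although it does not show which package loaded which copy.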

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ F:\anaconda3\lib\runpy.py:196 in _run_module_as_main                                             │
│                                                                                                  │
│   193 │   main_globals = sys.modules["__main__"].__dict__                                        │
│   194 │   if alter_argv:                                                                         │
│   195 │   │   sys.argv[0] = mod_spec.origin                                                      │
│ ❱ 196 │   return _run_code(code, main_globals, None,                                             │
│   197 │   │   │   │   │    "__main__", mod_spec)                                                 │
│   198                                                                                            │
│   199 def run_module(mod_name, init_globals=None,                                                │
│                                                                                                  │
│ F:\anaconda3\lib\runpy.py:86 in _run_code                                                        │
│                                                                                                  │
│    83 │   │   │   │   │      __loader__ = loader,                                                │
│    84 │   │   │   │   │      __package__ = pkg_name,                                             │
│    85 │   │   │   │   │      __spec__ = mod_spec)                                                │
│ ❱  86 │   exec(code, run_globals)                                                                │
│    87 │   return run_globals                                                                     │
│    88                                                                                            │
│    89 def _run_module_code(code, init_globals=None,                                              │
│                                                                                                  │
│ in <module>:7                                                                                    │
│                                                                                                  │
│   4 from accelerate.commands.accelerate_cli import main                                          │
│   5 if __name__ == '__main__':                                                                   │
│   6 │   sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])                         │
│ ❱ 7 │   sys.exit(main())                                                                         │
│   8                                                                                              │
│                                                                                                  │
│ F:\anaconda3\lib\site-packages\accelerate\commands\accelerate_cli.py:45 in main                  │
│                                                                                                  │
│   42 │   │   exit(1)                                                                             │
│   43 │                                                                                           │
│   44 │   # Run                                                                                   │
│ ❱ 45 │   args.func(args)                                                                         │
│   46                                                                                             │
│   47                                                                                             │
│   48 if __name__ == "__main__":                                                                  │
│                                                                                                  │
│                                                                                                  │
│   577 │   process.wait()                                                                         │
│   578 │   if process.returncode != 0:                                                            │
│   579 │   │   if not args.quiet:                                                                 │
│ ❱ 580 │   │   │   raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)    │
│   581 │   │   else:                                                                              │
│   582 │   │   │   sys.exit(1)                                                                    │
│   583                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
CalledProcessError: Command '['F:\\anaconda3\\python.exe', 'train_network.py', '--enable_bucket',
'--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=C:\\Users\\Remi\\Desktop\\dongxu', '--resolution=450,800',
'--output_dir=E:\\sd.webui\\webui\\embeddings', '--logging_dir=C:\\Users\\Remi\\Desktop\\dongxu\\logs', '--network_alpha=1', '--save_model_as=safetensors',      
'--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001', '--network_dim=8', '--output_name=dongxu_try1', '--lr_scheduler_num_cycles=1',  
'--learning_rate=0.0001', '--lr_scheduler=cosine', '--lr_warmup_steps=6', '--train_batch_size=1', '--max_train_steps=56', '--save_every_n_epochs=1',
'--mixed_precision=fp16', '--save_precision=fp16', '--seed=313', '--caption_extension=.txt', '--cache_latents', '--cache_latents_to_disk',
'--optimizer_type=Lion', '--max_train_epochs=10', '--bucket_reso_steps=64', '--save_state', '--xformers', '--bucket_no_upscale', "--mixed_precision'fp16'"]'     
returned non-zero exit status 3.
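
The last element of the failing command, "--mixed_precision'fp16'", looks malformed: the flag and its value are fused with quote characters, and --mixed_precision=fp16 is already passed earlier in the list. Whether this contributes to the non-zero exit status is not established here. A minimal Python sketch for spotting such arguments follows; check_args and flag_name are hypothetical helpers, not part of accelerate or of the training scripts.

```python
import re


def flag_name(arg):
    """Return the leading '--name' part of an argument, or None if it has none."""
    match = re.match(r"--[A-Za-z0-9_]+", arg)
    return match.group(0) if match else None


def check_args(argv):
    """Report arguments that contain a quote character or repeat an earlier flag.

    This is a heuristic for illustration only (hypothetical helper): a quote
    inside an argument usually means a flag and its value were glued together.
    """
    problems = []
    seen = set()
    for arg in argv:
        if "'" in arg or '"' in arg:
            problems.append(f"stray quote in {arg!r}")
        name = flag_name(arg)
        if name is not None:
            if name in seen:
                problems.append(f"flag {name} passed more than once")
            seen.add(name)
    return problems
```

Run against the argument list from the CalledProcessError above, this would report both the stray quotes and the repeated --mixed_precision flag.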


@bmaltais
Owner

Try running accelerate config manually
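
For reference, the warning in the log names the four launch values that fell back to defaults. Besides running accelerate config, they can be passed to accelerate launch directly. A minimal Python sketch that assembles such a command, assuming a hypothetical helper build_launch_command and using only the flag names printed in the warning:

```python
import shlex


def build_launch_command(script, script_args, *, num_processes=1, num_machines=1,
                         mixed_precision="no", dynamo_backend="no"):
    """Build an `accelerate launch` argument list with the four values set explicitly.

    `build_launch_command` is a hypothetical helper for this sketch. The defaults
    mirror the values that the warning says were used.
    """
    return [
        "accelerate", "launch",
        f"--num_processes={num_processes}",
        f"--num_machines={num_machines}",
        f"--mixed_precision={mixed_precision}",
        f"--dynamo_backend={dynamo_backend}",
        script,
        *script_args,
    ]


if __name__ == "__main__":
    cmd = build_launch_command(
        "train_network.py",
        ["--network_module=networks.lora"],
        mixed_precision="fp16",
    )
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
```

This only addresses the defaults warning. It may not resolve the non-zero exit status reported above.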

@Remosy Remosy closed this as completed Sep 3, 2023
bmaltais pushed a commit that referenced this issue Jan 27, 2024
Fix network multiplier cause crashed while use multi-GPUs