Is it possible to implement Accelerate from Hugging Face? #700

Closed · Omegadarling opened this issue Jan 5, 2023 · 12 comments

@Omegadarling

WHAT
'Accelerate' is for training with multiple GPUs.

WHERE
https://huggingface.co/docs/transformers/accelerate

HOW
I'm just an artist who has a healthy appreciation for coding, but yeah, check the URL.

WHEN
Well, I have a 10 x 3090 GPU machine and it might be nice to train models faster.

@78Alpha

78Alpha commented Jan 5, 2023

Have you tried setting ACCELERATE=true for the webui?
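
(For reference, the edit being described goes in webui-user.bat. A rough sketch of what that file typically looks like with the flag added; exact contents vary by install:)

```bat
@echo off
rem webui-user.bat -- sketch of a typical Automatic1111 launcher config on Windows

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=

rem Opt in to launching through Hugging Face Accelerate (read by webui.bat)
set ACCELERATE="True"

call webui.bat
```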

@d8ahazard
Owner

This was implemented quite a while ago...

@Omegadarling
Author

@d8ahazard That's great and I kind of see it "working" after I added set ACCELERATE="True" to my webui-user.bat, but do you know of anywhere that someone explains how to do this with Automatic1111?

@78Alpha

78Alpha commented Jan 12, 2023

There is very little about Accelerate itself that can be searched up online. Here is the pull request where it was added; that is the best source of info for the time being.

AUTOMATIC1111/stable-diffusion-webui#4527
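
(For anyone skimming that thread: as far as I can tell, the gist of the change is that webui.bat checks the ACCELERATE variable and, if it is "True" and the venv has accelerate.exe, launches through it instead of calling Python directly. A paraphrased sketch of that dispatch, not the exact diff:)

```bat
rem Fragment paraphrasing the dispatch added by that PR
if [%ACCELERATE%] == ["True"] goto :accelerate
goto :launch

:accelerate
rem Use the venv's accelerate.exe if it exists, otherwise fall back to plain Python
set ACCELERATE="%VENV_DIR%\Scripts\accelerate.exe"
if EXIST %ACCELERATE% goto :accelerate_launch

:launch
%PYTHON% launch.py %*
exit /b

:accelerate_launch
echo Accelerating
%ACCELERATE% launch --num_cpu_threads_per_process=6 launch.py
exit /b
```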

@Omegadarling
Author

I've gotten it further by running pip install -r requirements.txt for Automatic1111, but when I ran with ACCELERATE="True" again I got some very worrying error messages. Here's my log from Windows PowerShell:
SD_accelerateTrue_log_2023-01-11-1823.txt

It does appear that Accelerate is starting to split work across the 8 GPUs that are currently installed (I pulled two for a workstation, but I'll put them back in once I get Accelerate working).

But there are some very suspicious attempts to connect to [www.007guard.com]:29500, and when I looked up the commit hash that gets printed after that, it also references this 007guard website, which no longer exists. This only comes up when I have Accelerate enabled; I don't see that error message when I run normally on one GPU.
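
(A note on that address: 29500 is the default main-process / rendezvous port used by torch.distributed, which Accelerate launches through, so the launcher is most likely just trying to reach the local machine on port 29500 and something on the box is resolving that address to www.007guard.com. If you ever need to pin the address explicitly, these are standard `accelerate launch` options; shown only as a sketch, not webui-specific advice:)

```bat
rem Sketch only: standard "accelerate launch" flags that pin the rendezvous
rem address/port workers use to find the main process (defaults: 127.0.0.1:29500).
accelerate launch --num_processes 8 --main_process_ip 127.0.0.1 --main_process_port 29500 launch.py
```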

I use this 10 GPU beast of a machine to do my day job (lot of 3D rendering) but it's so frustrating to have all that untapped power that could be used to make some really deep Stable Diffusion models and embeddings!

@78Alpha

78Alpha commented Jan 13, 2023

For the 007guard, take a look at this for some info https://superuser.com/questions/706729/007guard-what-is-it-is-it-dangerous-and-can-it-be-removed

@Omegadarling
Author

Omegadarling commented Jan 13, 2023

For the 007guard, take a look at this for some info https://superuser.com/questions/706729/007guard-what-is-it-is-it-dangerous-and-can-it-be-removed

Wow. I do have Spybot running, so that's probably spot on! I'm uncommenting the localhost line, but there is a line above it that says "# localhost name resolution is handled within DNS itself." so I'm hoping this doesn't create a conflict somewhere...

Also, THANK YOU!
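
(For anyone hitting the same thing: a rough sketch of the relevant part of C:\Windows\System32\drivers\etc\hosts after a Spybot "immunization". The exact blocked-domain entries vary per machine, and the "handled within DNS itself" line is only a comment, so restoring the localhost mapping next to it doesn't conflict with anything:)

```text
# localhost name resolution is handled within DNS itself.
127.0.0.1       localhost

# Entries like these are what Spybot's immunization adds; they map known
# ad/malware domains to the loopback address:
127.0.0.1       www.007guard.com
127.0.0.1       007guard.com
```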

@78Alpha

78Alpha commented Jan 13, 2023

And adding on about it not finding GPUs: I noticed 'Torch is not able to use GPU; add --skip-torch-cuda-test' in the log. From my experience, that's an issue with Accelerate. Running it from a script or an anaconda env gives it trouble; running the script directly will work, or at least provide an error to work with. Accelerate tends to spit out messages that amount to "an error has occurred because an error has occurred."

You can try running accelerate on something from a test venv or your main interpreter and see if it can find the GPUs.
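
(A minimal way to run that check, using only standard PyTorch / Accelerate commands; run these from inside whichever venv or interpreter you want to test:)

```bat
rem Does PyTorch itself see the GPUs?
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"

rem Accelerate's own environment report, which lists the GPUs it detects:
accelerate env

rem Build a config interactively, then run Accelerate's built-in sanity test with it:
accelerate config
accelerate test
```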

@Omegadarling
Author

Omegadarling commented Jan 13, 2023

@78Alpha Could it be something as simple as how old my Nvidia driver is? I'm using 472.47, which came out on 2021.11.10. I had to use an older driver to use RNDR, but RNDR now works with newer drivers. The only reason I haven't updated is that it takes over an hour for the drivers to install. Something about PCIe enumeration just gets exponentially slower with each additional GPU.

@78Alpha

78Alpha commented Jan 13, 2023

I couldn't say for sure. I usually drop ACCELERATE altogether when it starts complaining about not being able to find a GPU. Some anaconda environments or venvs work, others won't; it seemed hit or miss. I've used both new and old drivers myself, on an RTX card and even a Tesla P40.

All from a Windows environment, of course, plus one Colab.

@Omegadarling
Author

I couldn't say for sure. I usually drop ACCELERATE altogether when it starts complaining about not being able to find a GPU.

Are you using a different library for multi-GPU or just living with ONLY one GPU?

@78Alpha

78Alpha commented Jan 14, 2023

I couldn't say for sure. I usually drop ACCELERATE altogether when it starts complaining about not being able to find a GPU.

Are you using a different library for multi-GPU or just living with ONLY one GPU?

I live with using just the one GPU.
