Dockerfile / docker-compose to help streamline build process #547
Conversation
Thanks! I was just about to start work on a similar PR. I'm testing it now. I think it would make more sense to use the source from the current directory, rather than pulling from the public git repo. This would make it easier for devs to test their patches within an isolated environment. |
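For reference, building from the local checkout instead of a remote clone is just a matter of pointing the compose build context at the current directory. A minimal sketch (the service name is illustrative):

```yaml
services:
  text-generation-webui:
    build:
      context: .              # build from the current checkout, not the public git repo
      dockerfile: Dockerfile
```

With the source copied in at build time, local patches get picked up on the next `docker compose build`.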
Unfortunately, it looks like testing failed:
|
It looks like it wants this patch: bumping the GPTQ SHA to 841feedde876785bc8022ca48fd9c3ff626587e2 gets past this. |
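One way to make that pin easy to bump would be to pass the commit in as a build argument. A sketch, assuming the Dockerfile checks out whatever SHA it receives (the GPTQ_SHA arg name is hypothetical, not necessarily what this PR uses):

```yaml
services:
  text-generation-webui:
    build:
      context: .
      args:
        # Hypothetical build arg: the Dockerfile would run `git checkout "$GPTQ_SHA"`
        # inside its GPTQ-for-LLaMa clone step.
        GPTQ_SHA: 841feedde876785bc8022ca48fd9c3ff626587e2
```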
@deece have you tried setting TORCH_CUDA_ARCH_LIST in the docker-compose to the value your graphics card needs? The error you posted indicates that you didn't. |
Yup, my oldest card is an M40, which requires that patch.
|
@deece with M40 do you mean a Quadro M4000? |
Tesla M40. I also have a Tesla K80, but it doesn't really get used.
|
https://developer.nvidia.com/cuda-gpus <- based on that docs page, your M40 expects compute capability 5.2. Try changing TORCH_CUDA_ARCH_LIST from 7.5 to 5.2 in the docker-compose.yml |
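In concrete terms, the change would look something like this in docker-compose.yml (sketch; the service name is illustrative, and the arg is assumed to be passed through to the CUDA kernel build):

```yaml
services:
  text-generation-webui:
    build:
      context: .
      args:
        # The Maxwell-era Tesla M40 is compute capability 5.2; Turing cards use 7.5.
        TORCH_CUDA_ARCH_LIST: "5.2"
```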
…ser to edit their configurables in one place and not be confused by the docker-compose/Dockerfile structure
…riant with gptq-pre-layer 20
I don't think that will work, as the patch mentioned above suggests that it will break for anything under 6.0.
That patch does work, though, and all that is needed to get it is to roll the pinned commit forward a bit (I tested the current HEAD and that worked).
|
@deece I tried your suggested SHA 841feedde876785bc8022ca48fd9c3ff626587e2 and HEAD, which made it fail with load_quant() missing 1 required positional argument: 'pre_layer'. I've updated the PR and moved all configs into an .env file, which might make it easier to test/compare. |
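For anyone following along, a sketch of how a compose file can pull its configurables from a .env file (Compose automatically reads a .env file in the project directory for `${...}` substitution; `env_file` additionally injects the variables into the running container):

```yaml
services:
  text-generation-webui:
    build:
      context: .
      args:
        # Substituted from .env at build time; falls back to 7.5 if unset.
        TORCH_CUDA_ARCH_LIST: ${TORCH_CUDA_ARCH_LIST:-7.5}
    env_file:
      - .env                  # also expose the same settings to the container at runtime
```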
Thanks, I'm out all day tomorrow, but I'll have another crack on Monday
|
@deece it now uses HEAD; I've updated it to work with the new changes ( https://github.com/oobabooga/text-generation-webui/wiki/LLaMA-model#4-bit-mode ) |
How about also preloading extensions into the Docker image? |
@MarlinMr mapped the extensions folder (and a few others) in the docker-compose |
Yeah, that makes sense for local configuration. But I was thinking more like pulling dependencies for the currently supported extensions into the Docker image. |
@MarlinMr it now runs pip3 installs for the extensions too, using the same caching as the others; also added port 5000 for the API via docker-compose |
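The port/volume side of that would look roughly like this (sketch; 7860 is the web UI's usual default, and the container paths are illustrative):

```yaml
services:
  text-generation-webui:
    ports:
      - "7860:7860"                      # web UI
      - "5000:5000"                      # API, as added in this PR
    volumes:
      - ./extensions:/app/extensions     # mapped so local extension edits persist
```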
@oobabooga mind merging this? It would make it easier to hop branches and test in Docker |
It might be worth squashing/refactoring the commits before merging the PR. Maybe even squashing it down to a single commit? |
There are a couple of variables missing from the sample env file:
|
yeah, this PR has turned into a bit of a mess; I'll close this one and create a new, clean one |
Wanted to run in Docker; used https://github.com/RedTopper's version in #174 as a base, slightly modified.
Added a small section to the README to explain how to start up; the defaults of this config run with < 4 GB of VRAM.
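Putting the pieces discussed in this thread together, a minimal compose file for this setup might look like the sketch below (service name, paths, and defaults are illustrative; start it with `docker compose up --build`):

```yaml
services:
  text-generation-webui:
    build:
      context: .                                  # build from the local checkout
      args:
        TORCH_CUDA_ARCH_LIST: ${TORCH_CUDA_ARCH_LIST:-7.5}
    env_file:
      - .env                                      # configurables in one place
    ports:
      - "7860:7860"                               # web UI
      - "5000:5000"                               # API
    volumes:
      - ./extensions:/app/extensions
      - ./models:/app/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia                      # GPU passthrough (Compose v2 syntax)
              count: all
              capabilities: [gpu]
```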