
Dockerfile #174

Closed
slush0 opened this issue Mar 6, 2023 · 16 comments

@slush0

slush0 commented Mar 6, 2023

If there's any interest (to use it, or to add it to this repo), I've knocked up a Dockerfile.

https://github.com/slush0/docker-misc/blob/master/text-generation-webui/Dockerfile

It is also available at https://hub.docker.com/r/slush0/text-generation-webui.

@slush0 slush0 changed the title Docker file Dockerfile Mar 6, 2023
@phokur

phokur commented Mar 17, 2023

I can't seem to connect to the instance locally once it's up. What is your docker run command?
docker run --gpus all -p 7860:7860 -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY -it slush0/text-generation-webui

@rkfg

rkfg commented Mar 18, 2023

I think the problem is that there's no entry point or RUN line in that Dockerfile so the container terminates immediately.
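
A minimal fix might be to append a CMD at the end so the container starts the web UI instead of exiting immediately (just a sketch; --listen makes the server bind to 0.0.0.0 so a -p port mapping works from the host):

```dockerfile
# Hypothetical final line for the Dockerfile: start the web UI in the
# foreground so the container stays alive and is reachable via -p 7860:7860.
CMD ["python3", "server.py", "--listen"]
```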

@RedTopper

If anyone happens to want to use Podman instead, I have a repo here called Text-Generation-Webui-Podman. It compiles and installs the GPTQ-for-LLaMa repo, so 4-bit works too.

In theory the Containerfile should be compatible with Docker, but I haven't tested it.

@tensiondriven
Contributor

Does anyone know whether it's possible to run multiple instances of this Dockerfile on a single GPU? I'm not able to run multiple instances from bash; when I try, the second task exits with no error:

(with first instance already running)

$ python server.py --cai-chat --gptq-bits 4 --model llama-13b --listen-port 8002 --verbose
Loading llama-13b...
Loading model ...
Killed

Related to #58

@oobabooga
Owner

I am interested in making this the default installation method to avoid all the setup trouble.

Here is what I have tried:

  1. Download the Dockerfile
  2. Build the image with
docker build . -f Dockerfile -t oobabooga
  3. Teleport into the image with
docker run -i -t oobabooga bash
  4. Download a test model
python3 download-model.py facebook/galactica-125m
  5. Try to start the web UI
python3 server.py

I get the error

RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

@nelsonjchen

By default, Docker will not pass in the GPU.

What happens if you try to run:

docker run -it --rm --gpus all ubuntu nvidia-smi

from https://docs.docker.com/engine/reference/commandline/run/#gpus ?

@oobabooga
Owner

No luck:

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

In your link, it is said that

First you need to install nvidia-container-runtime

Which I guess needs to be installed on the host operating system. Would that work on Windows?
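
On a Linux host I assume the installation would be something like this (untested sketch; package name and repository setup per NVIDIA's container toolkit docs, which assume their apt repository is already configured):

```shell
# Sketch for an Ubuntu host with NVIDIA's apt repository already set up.
sudo apt-get update
sudo apt-get install -y nvidia-container-runtime
sudo systemctl restart docker
# Sanity check that a container can see the GPU:
docker run --rm --gpus all ubuntu nvidia-smi
```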

@nelsonjchen

I think that Docker instruction might be a bit old and/or specific to Linux.

Try these instructions, which are specific to WSL2:

https://docs.nvidia.com/cuda/wsl-user-guide/index.html#getting-started-with-cuda-on-wsl

@oobabooga
Owner

oobabooga commented Mar 22, 2023

I managed to get 4-bit working with the Dockerfile below based on the original by @slush0. Two preliminary steps were necessary:

  1. Installing nvidia-container-runtime in the host OS.
  2. Changing the docker configuration file mentioned in the comment below before building the image. This is necessary in order to get nvcc working in the container.

pytorch/extension-cpp#71 (comment)

Otherwise I would get the error

     arch_list[-1] += '+PTX'
IndexError: list index out of range
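
For reference, if I remember the linked comment correctly, the configuration change is making nvidia the default Docker runtime on the host, so that nvcc and the driver libraries are visible during docker build. That means editing /etc/docker/daemon.json to something like:

```json
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
```

and then restarting the Docker daemon before building the image.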

Dockerfile

FROM pytorch/pytorch

# Install base utilities
RUN apt-get update && apt-get install -y build-essential wget git vim
RUN apt-get clean && rm -rf /var/lib/apt/lists/*

VOLUME /data

RUN git clone https://github.com/oobabooga/text-generation-webui /app
WORKDIR /app
RUN pip install -r requirements.txt
RUN rm -rf /app/models
RUN ln -s /data /app/models

RUN mkdir repositories
WORKDIR /app/repositories
RUN git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa
WORKDIR /app/repositories/GPTQ-for-LLaMa
RUN python setup_cuda.py install
WORKDIR /app

#ENV PATH=$PATH:/app

Connecting

I used this command to connect to it:

docker run -it --rm --gpus all --net=host oobabooga bash
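
Since the Dockerfile declares VOLUME /data and symlinks /app/models to it, a variant of this command that persists downloaded models on the host could look like this (the host path is just an example):

```shell
# Mount a host directory over /data so models survive container removal.
docker run -it --rm --gpus all --net=host \
    -v "$HOME/textgen-models:/data" \
    oobabooga python3 server.py --listen
```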

Not the most user-friendly setup because of the preliminary steps, but for people who already have experience with Docker this should be useful. Building the image itself is trivial.

@nelsonjchen

Valorant won't run on my Windows PC for some stupid reason at the moment and I was planning to reformat and reinstall my PC with another SSD as a temporary test to see if that was the path to get it working. It's been long overdue.

I bring this up because I'll also see if I can get the most minimal steps re: Docker + NVIDIA + WSL2 + Windows going as well. I think nvidia-container-runtime is more of a Linux-ism; on Windows there's the alternative Microsoft WSL2 + Docker Desktop ingredient, which I linked.

@RedTopper

@oobabooga If it helps your cause, I updated my repo to also support Docker. It has some extra goodies like persisting the container's data, a smaller final build size, and caching of downloaded pip packages.

Containerfile and small Conversion Script so it works with Docker.

I'd like to make sure it's helpful for people so if anyone has a problem with it feel free to open an issue.

@phokur

phokur commented Mar 23, 2023

I finally figured out my port passing issue.
--net=host is not supported in Docker Desktop on Windows + WSL2.
You need to use -p 7860:7860 in the docker run command AND
python3 server.py --listen (or it will just throw reset connections in the host's browser)
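
Putting both pieces together, the full invocation (image name as built above) would be roughly:

```shell
# Publish the port explicitly and make the server bind to 0.0.0.0.
docker run -it --rm --gpus all -p 7860:7860 \
    oobabooga python3 server.py --listen
```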

@suhr

suhr commented Mar 25, 2023

When I try to run it in docker I get the following error:

Traceback (most recent call last):
  File "/app/server.py", line 234, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "/app/modules/models.py", line 49, in load_model
    model = AutoModelForCausalLM.from_pretrained(Path(f"models/{shared.model_name}"), device_map='auto', load_in_8bit=True)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 471, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2643, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2966, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py", line 673, in _load_state_dict_into_meta_model
    set_module_8bit_tensor_to_device(model, param_name, param_device, value=param)
  File "/opt/conda/lib/python3.10/site-packages/transformers/utils/bitsandbytes.py", line 70, in set_module_8bit_tensor_to_device
    new_value = bnb.nn.Int8Params(new_value, requires_grad=False, has_fp16_weights=has_fp16_weights).to(device)
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 196, in to
    return self.cuda(device)
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/nn/modules.py", line 160, in cuda
    CB, CBt, SCB, SCBt, coo_tensorB = bnb.functional.double_quant(B)
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/functional.py", line 1616, in double_quant
    row_stats, col_stats, nnz_row_ptr = get_colrow_absmax(
  File "/opt/conda/lib/python3.10/site-packages/bitsandbytes/functional.py", line 1505, in get_colrow_absmax
    lib.cget_col_row_stats(ptrA, ptrRowStats, ptrColStats, ptrNnzrows, ct.c_float(threshold), rows, cols)
  File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 387, in __getattr__
    func = self.__getitem__(name)
  File "/opt/conda/lib/python3.10/ctypes/__init__.py", line 392, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats
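
From what I can tell, this undefined-symbol error means bitsandbytes loaded its CPU library (libbitsandbytes_cpu.so) because it couldn't detect CUDA inside the container. A quick way to check (assuming the image from above):

```shell
# If this prints False, the container probably wasn't started with --gpus all,
# or the NVIDIA runtime isn't set up on the host.
docker run --rm --gpus all oobabooga \
    python3 -c "import torch; print(torch.cuda.is_available())"
```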

@jatinsne

Error response from daemon: could not select device driver "nvidia" with capabilities: [[gpu]]

@raymondbernard

It would be great to have a CPU-only installation with Docker!
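
Something like this might already work as a starting point (untested sketch based on the Dockerfile above: drop the GPTQ CUDA build and pass server.py's --cpu flag):

```dockerfile
# Hypothetical CPU-only variant; no NVIDIA runtime needed.
FROM python:3.10-slim
RUN apt-get update && apt-get install -y build-essential git \
    && rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/oobabooga/text-generation-webui /app
WORKDIR /app
RUN pip install -r requirements.txt
VOLUME /data
RUN rm -rf /app/models && ln -s /data /app/models
CMD ["python3", "server.py", "--cpu", "--listen"]
```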

@github-actions github-actions bot added the stale label Dec 7, 2023

github-actions bot commented Dec 7, 2023

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

@github-actions github-actions bot closed this as completed Dec 7, 2023

10 participants