
[tune/autoscaler/docker] GPU support in docker #8975

Closed
vtomenko opened this issue Jun 16, 2020 · 21 comments

@vtomenko

GPU support in docker

Reading docs on Automatic Cluster Setup, section "Docker":

This currently does not have GPU support

However, aws/example-gpu-docker.yaml, coupled with a merged PR, suggests GPU is supported.

Could you please clarify? If GPU is indeed supported in docker, would docker be a good choice from a stability perspective?

@vtomenko vtomenko added the question Just a question :) label Jun 16, 2020
@richardliaw
Contributor

Hey @vtomenko - you might run into some bugs with the current state of the code, but @ijrsvt is hard at work on improving this right now.

@ijrsvt feel free to chime in

@rkooo567 rkooo567 added the triage Needs triage (eg: priority, bug/not-bug, and owning component) label Jun 17, 2020
@vtomenko
Author

Hey @richardliaw - thanks for your reply. I'll give it a try and let you guys know what happened.

It is also great to hear you are working on this - I believe the combination of a stable autoscaler and docker with GPU support would be really beneficial for many ML scenarios because it cleanly separates concerns:

  • the high-level cluster description, including instance details, is handled by the cluster YAML
  • the specific ML training task is fully contained (including dependencies) and described with docker

In this way an ML engineer could just change the docker image reference in the cluster YAML and leave the remaining parts as is (as opposed to updating setup_commands for each different ML task). It would then be really easy to execute different training tasks on the same cluster, provided ray correctly handles the sequence of (a) updating the docker image reference in the cluster YAML and (b) running ray up. A sketch of what I mean is below.
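
For illustration, the kind of docker section I have in mind in the cluster YAML (a rough sketch; field names follow the autoscaler example configs, and the image name is just a placeholder):

docker:
    image: my-registry.example.com/my-training-task:latest   # the only line an ML engineer would change per task
    container_name: ml_training
    pull_before_run: True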

@ijrsvt
Contributor

ijrsvt commented Jun 17, 2020

@vtomenko Please do let us know what the result is for running it with GPUs! If you have any further questions or problems please let me know!

@richardliaw
Contributor

@vtomenko absolutely agree! Will keep you updated from our side too.

cc @anabranch @pcmoritz

@vtomenko
Author

vtomenko commented Jun 18, 2020

I'm using ray version 0.8.5 and trying to test a basic docker setup.

From autoscaler doc:

Docker: Specify docker image. This executes all commands on all nodes in the docker container, and opens all the necessary ports to support the Ray cluster. It will also automatically install Docker if Docker is not installed.

This does not seem to be the case; here is the error when creating the cluster with, say, the busybox docker image:
Command 'docker' not found, but can be installed with:

Installing docker in the initialization_commands section also does not help, for the reasons described in #7519.

Here is what I have in initialization_commands:

[
"sudo apt update -y",
"sudo apt install docker.io -y",
"sudo usermod -aG docker $USER",
"sudo systemctl restart docker"
]

The commands above run fine. Next, ray tries to pull the image, and this is where it fails:

2020-06-18 06:48:14,835 INFO updater.py:264 -- NodeUpdater: i-025cd6bcda77a92aa: Running sudo usermod -aG docker $USER on 35.165.161.154...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-06-18 06:48:14,916 INFO updater.py:264 -- NodeUpdater: i-025cd6bcda77a92aa: Running sudo systemctl restart docker on 35.165.161.154...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2020-06-18 06:48:16,019 INFO updater.py:264 -- NodeUpdater: i-025cd6bcda77a92aa: Running docker pull busybox:latest on 35.165.161.154...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/images/create?fromImage=busybox&tag=latest: dial unix /var/run/docker.sock: connect: permission denied
2020-06-18 06:48:16,114 INFO log_timer.py:17 -- NodeUpdater: i-025cd6bcda77a92aa: Initialization commands completed [LogTimer=23330ms]
2020-06-18 06:48:16,114 INFO log_timer.py:17 -- NodeUpdater: i-025cd6bcda77a92aa: Applied config a8575940426af8a45f754a232f9474405ae13ee8 [LogTimer=51055ms]
2020-06-18 06:48:16,114 ERROR updater.py:359 -- NodeUpdater: i-025cd6bcda77a92aa: Error updating (Exit Status 1) ssh -i /home/ubuntu/.ssh/ray-autoscaler_1_us-west-2.pem -o ConnectTimeout=120s -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C -o ControlPersist=10s -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 ubuntu@35.165.161.154 bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker pull busybox:latest'
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 362, in run
raise e
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 352, in run
self.do_update()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 436, in do_update
self.cmd_runner.run(cmd)
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 274, in run
self.process_runner.check_call(final_cmd)
File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', '-i', '/home/ubuntu/.ssh/ray-autoscaler_1_us-west-2.pem', '-o', 'ConnectTimeout=120s', '-o', 'StrictHostKeyChecking=no', '-o', 'ControlMaster=auto', '-o', 'ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C', '-o', 'ControlPersist=10s', '-o', 'IdentitiesOnly=yes', '-o', 'ExitOnForwardFailure=yes', '-o', 'ServerAliveInterval=5', '-o', 'ServerAliveCountMax=3', 'ubuntu@35.165.161.154', 'bash', '--login', '-c', '-i', "'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker pull busybox:latest'"]' returned non-zero exit status 1.

Note: when I manually attach to the cluster after this error and run docker pull busybox (as a non-root user), it works as expected.

Can you please point me to any currently working example of how to configure the autoscaler with docker?

@ijrsvt
Contributor

ijrsvt commented Jun 18, 2020

Hmm, let me try working on a solution to this--if you rerun after the first install, does it work?
Also, what are you running on: a local cluster or a public cloud?

@vtomenko
Author

I use AWS; attaching the configuration I tried.

Rerunning does not help:

2020-06-18 08:01:00,874 INFO updater.py:264 -- NodeUpdater: i-0374e72fcc8b8ea2a: Running docker inspect -f '{{.State.Running}}' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash on 54.69.9.155...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
Template parsing error: template: :1:8: executing "" at <.State.Running>: map has no entry for key "State"
WARNING: Published ports are discarded when using host network mode
e2735eef81dcd0dd304fd131ab7c092c68556b8414e908449bd925fb08810bd7
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: "bash": executable file not found in $PATH": unknown.
2020-06-18 08:01:01,527 INFO log_timer.py:17 -- NodeUpdater: i-0374e72fcc8b8ea2a: Setup commands completed [LogTimer=653ms]
2020-06-18 08:01:01,527 INFO log_timer.py:17 -- NodeUpdater: i-0374e72fcc8b8ea2a: Applied config 9784c6630decbe7b8ad6cb2a34f12c3c48314063 [LogTimer=2745ms]
2020-06-18 08:01:01,527 ERROR updater.py:359 -- NodeUpdater: i-0374e72fcc8b8ea2a: Error updating (Exit Status 127) ssh -i /home/ubuntu/.ssh/ray-autoscaler_1_us-west-2.pem -o ConnectTimeout=120s -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C -o ControlPersist=10s -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 ubuntu@54.69.9.155 bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker inspect -f '"'"'{{.State.Running}}'"'"' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash'
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 362, in run
raise e
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 352, in run
self.do_update()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 440, in do_update
self.cmd_runner.run(cmd)
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 274, in run
self.process_runner.check_call(final_cmd)
File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', '-i', '/home/ubuntu/.ssh/ray-autoscaler_1_us-west-2.pem', '-o', 'ConnectTimeout=120s', '-o', 'StrictHostKeyChecking=no', '-o', 'ControlMaster=auto', '-o', 'ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C', '-o', 'ControlPersist=10s', '-o', 'IdentitiesOnly=yes', '-o', 'ExitOnForwardFailure=yes', '-o', 'ServerAliveInterval=5', '-o', 'ServerAliveCountMax=3', 'ubuntu@54.69.9.155', 'bash', '--login', '-c', '-i', "'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker inspect -f '"'"'{{.State.Running}}'"'"' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash'"]' returned non-zero exit status 127.

@ijrsvt
Contributor

ijrsvt commented Jun 18, 2020

I'll try using that AMI! Thanks for sharing this!

@ijrsvt
Contributor

ijrsvt commented Jun 18, 2020

The issue is that we reuse SSH sessions. I'm working on a PR now, but in the meantime there is a subpar workaround (sketched below):

  1. Run ray up, with the initialization commands you have above
  2. Remove the initialization commands
  3. Rerun ray up in about 15 seconds or so.
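
Concretely, the sequence looks something like this (cluster.yaml is a placeholder for your config file; the short wait lets the cached SSH control connection, ControlPersist=10s in the logs above, expire so that the next session picks up the new docker group membership):

ray up cluster.yaml     # first pass: initialization_commands install docker, the image pull step still fails
# edit cluster.yaml and remove (or comment out) the initialization_commands
sleep 15                # let the reused SSH control connection expire
ray up cluster.yaml     # second pass: a fresh SSH session can talk to the docker daemon and pulls/runs the image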

@vtomenko
Author

Thanks @ijrsvt, I tried the workaround and it fails. The error seems to be similar to the one I reported previously for the rerun. Does the workaround work for you?

2020-06-18 22:45:57,181 INFO updater.py:264 -- NodeUpdater: i-0230e74237fc85cc4: Running docker inspect -f '{{.State.Running}}' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash on 54.187.139.187...
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
Template parsing error: template: :1:8: executing "" at <.State.Running>: map has no entry for key "State"
WARNING: Published ports are discarded when using host network mode
99e67b7ac1327a73ee5f81be226f96a3712d785d5475080c511c60bddbe40609
docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "exec: "bash": executable file not found in $PATH": unknown.
2020-06-18 22:45:57,860 INFO log_timer.py:17 -- NodeUpdater: i-0230e74237fc85cc4: Setup commands completed [LogTimer=679ms]
2020-06-18 22:45:57,861 INFO log_timer.py:17 -- NodeUpdater: i-0230e74237fc85cc4: Applied config 0318fe55a69a325f60716771d1a6f9f9a36457ec [LogTimer=3565ms]
2020-06-18 22:45:57,861 ERROR updater.py:359 -- NodeUpdater: i-0230e74237fc85cc4: Error updating (Exit Status 127) ssh -i /home/ubuntu/.ssh/ray-autoscaler.pem -o ConnectTimeout=120s -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C -o ControlPersist=10s -o IdentitiesOnly=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 ubuntu@54.187.139.187 bash --login -c -i 'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker inspect -f '"'"'{{.State.Running}}'"'"' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash'
Exception in thread Thread-2:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 362, in run
raise e
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 352, in run
self.do_update()
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 440, in do_update
self.cmd_runner.run(cmd)
File "/home/ubuntu/projects/ai-ml-model-search/env/lib/python3.6/site-packages/ray/autoscaler/updater.py", line 274, in run
self.process_runner.check_call(final_cmd)
File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ssh', '-i', '/home/ubuntu/.ssh/ray-autoscaler.pem', '-o', 'ConnectTimeout=120s', '-o', 'StrictHostKeyChecking=no', '-o', 'ControlMaster=auto', '-o', 'ControlPath=/tmp/ray_ssh_1d41c853af/e55cd6ce94/%C', '-o', 'ControlPersist=10s', '-o', 'IdentitiesOnly=yes', '-o', 'ExitOnForwardFailure=yes', '-o', 'ServerAliveInterval=5', '-o', 'ServerAliveCountMax=3', 'ubuntu@54.187.139.187', 'bash', '--login', '-c', '-i', "'true && source ~/.bashrc && export OMP_NUM_THREADS=1 PYTHONWARNINGS=ignore && docker inspect -f '"'"'{{.State.Running}}'"'"' busybox || docker run --rm --name busybox -d -it -p 6379:6379 -p 8076:8076 -p 4321:4321 -e LC_ALL=C.UTF-8 -e LANG=C.UTF-8 --net=host busybox:latest bash'"]' returned non-zero exit status 127.

@ijrsvt
Contributor

ijrsvt commented Jun 18, 2020 via email

@vtomenko
Author

Makes sense, thank you @ijrsvt! I will update the YAML, use the docker image I built to train ML models, and see what happens.

@ijrsvt
Contributor

ijrsvt commented Jun 19, 2020

Awesome--I have a PR out to automatically install docker if it is not preinstalled; this won't be a problem in the future!

@vtomenko
Author

Hey @ijrsvt, thank you for the PR - any chance it will be merged into master soon, so that I can avoid the workaround?

@edoakes edoakes added P2 Important issue, but not time-critical enhancement Request for new feature and/or capability and removed question Just a question :) triage Needs triage (eg: priority, bug/not-bug, and owning component) labels Jun 26, 2020
@ijrsvt
Contributor

ijrsvt commented Jun 26, 2020

@vtomenko I am closing it for the moment because it actually breaks parts of the autoscaler--it should be merged in about a week.

@richardliaw richardliaw added the tune Tune-related issues label Jul 8, 2020
@ijrsvt
Contributor

ijrsvt commented Jul 20, 2020

@vtomenko any updates on your end? How are things going?

@vtomenko
Author

vtomenko commented Jul 24, 2020

@ijrsvt, I got it working with the following configuration for AWS:

  • head/worker nodes: Deep Learning Base AMI (Ubuntu 18.04), so that all required GPU drivers etc. and docker are preinstalled
  • instance type: g4dn.xlarge
  • docker image: based on nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04 (so that the container can use GPUs), plus, in my case, mxnet packages with GPU support
  • for a custom docker registry: added "docker login ..." to the initialization_commands section in the cluster YAML

Note: I also added openssh-client to the docker image because the autoscaler manages worker nodes via ssh from the docker container on the head node. Roughly, the relevant pieces are sketched below.
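
The sketch (AMI id, registry, and image names are placeholders; the layout follows the AWS example configs, so adapt to your setup):

# cluster YAML (fragments)
docker:
    image: my-registry.example.com/mxnet-gpu-train:latest   # built on nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
    container_name: ray_gpu
    run_options:
        - --runtime=nvidia                                   # expose the host GPUs to the container

head_node:
    InstanceType: g4dn.xlarge
    ImageId: ami-xxxxxxxxxxxxxxxxx                           # Deep Learning Base AMI (Ubuntu 18.04)

worker_nodes:
    InstanceType: g4dn.xlarge
    ImageId: ami-xxxxxxxxxxxxxxxxx

initialization_commands:
    - docker login my-registry.example.com -u my-user -p "$MY_REGISTRY_TOKEN"   # only needed for a private registry

# Dockerfile (fragment)
FROM nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
RUN apt-get update && apt-get install -y openssh-client     # the autoscaler SSHes to workers from the head container
# ... plus python, ray, and mxnet with GPU support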

@ijrsvt
Contributor

ijrsvt commented Jul 24, 2020

@vtomenko Great to hear!! The autoscaler should work without needing to directly SSH into the containers on the worker nodes. Was this not happening for you?

@vtomenko
Author

@ijrsvt, my understanding is that the autoscaler SSHes into worker nodes from the docker container on the head node. So if the container on the head node does not have an ssh client installed, the worker nodes are not configured properly. Looks like the same issue as #5496.

@ijrsvt
Contributor

ijrsvt commented Jul 24, 2020

Oh--that totally makes sense--my bad. I misread that as openssh-server. The general requirements of the autoscaler are captured in this Dockerfile. (I'll make sure to add SSH to it.) Please let me know if you have any other problems/questions/feedback! I'm always happy to help!

@richardliaw
Contributor

Should be supported now. Try using rayproject/ray:latest-gpu.
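
For example, a minimal docker section along these lines (container_name and run_options are illustrative; whether you need --runtime=nvidia or --gpus all depends on your docker/NVIDIA runtime setup):

docker:
    image: rayproject/ray:latest-gpu
    container_name: ray_container
    pull_before_run: True
    run_options:
        - --runtime=nvidia   # or "--gpus all" on newer docker versions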
