Multiple issues related to the `runpod` backend #1133
Comments
peterschmidt85 added a commit that referenced this issue on Apr 15, 2024
@jvstme In theory, we could detect if the image has a non-default entrypoint automatically, skip the |
@peterschmidt85, I think it shouldn't be difficult. See example of detecting the image entrypoint |
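A minimal sketch of the kind of check discussed in the two comments above. It assumes the image metadata is already available as one entry of the JSON array printed by `docker image inspect`, where the entrypoint lives under `Config.Entrypoint`. The helper names and the `SHELL_ENTRYPOINTS` set are hypothetical and not part of dstack; how the metadata is fetched (local Docker, registry API) is left out.

```python
from typing import Any, Dict, List, Optional

# Assumption: entrypoints that can run shell commands passed by the backend.
SHELL_ENTRYPOINTS = {"bash", "sh", "/bin/bash", "/bin/sh"}


def get_entrypoint(inspect_entry: Dict[str, Any]) -> Optional[List[str]]:
    """Return the image entrypoint as a list, or None if the image defines none."""
    config = inspect_entry.get("Config") or {}
    entrypoint = config.get("Entrypoint")
    if isinstance(entrypoint, str):
        # Normalize the rare string form to the usual list form.
        return [entrypoint]
    return entrypoint or None


def has_non_shell_entrypoint(inspect_entry: Dict[str, Any]) -> bool:
    """True if the image sets an entrypoint whose executable is not a shell."""
    entrypoint = get_entrypoint(inspect_entry)
    if entrypoint is None:
        return False
    return entrypoint[0] not in SHELL_ENTRYPOINTS
```

A caller could use the result to decide whether the entrypoint has to be overridden explicitly before submitting the job.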
TheBits pushed a commit that referenced this issue on Apr 15, 2024
TheBits pushed a commit that referenced this issue on Apr 15, 2024
Bihan pushed a commit to SMTM-Capital/dstack that referenced this issue on Apr 30, 2024
- … `runpod/pytorch:2.1.1-py3.10-cuda12.1.1-devel-ubuntu22.04` instead of `dstack`'s Docker image)
- The `runpod` backend always uses the Docker image's default entrypoint. In that case, no configuration will work if the Docker image's default entrypoint isn't `bash` or `sh`.
- … `22` SSH port on the container (instead of `10022` as other backends do)
- … `registry_auth`
- The `runpod` backend fails with `f"Wait instance {instance_id} timeout"` and proceeds to try another offer without terminating the pod that is being created. This leads to creating multiple pods instead of failing the job.
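
The timeout problem described above (a pod left running after the wait gives up) could be addressed along these lines. This is a minimal sketch, not the actual dstack fix: `wait_for_pod_or_terminate`, `WaitTimeoutError`, and the `is_ready` / `terminate` callables are hypothetical stand-ins for whatever the backend uses to poll and delete a pod. Only the error message format is taken from the issue.

```python
import time
from typing import Callable


class WaitTimeoutError(Exception):
    """Raised when the pod does not become ready in time (hypothetical name)."""


def wait_for_pod_or_terminate(
    instance_id: str,
    is_ready: Callable[[str], bool],    # hypothetical: polls the pod status
    terminate: Callable[[str], None],   # hypothetical: deletes the pod
    timeout: float = 300.0,
    interval: float = 5.0,
) -> None:
    """Wait until the pod is ready; terminate it before re-raising any failure."""
    deadline = time.monotonic() + timeout
    try:
        while time.monotonic() < deadline:
            if is_ready(instance_id):
                return
            time.sleep(interval)
        raise WaitTimeoutError(f"Wait instance {instance_id} timeout")
    except Exception:
        # Clean up first so that trying the next offer cannot leave
        # a half-created pod behind.
        terminate(instance_id)
        raise
```

With this shape, a caller that moves on to another offer after the exception never accumulates pods, because each failed attempt has already been cleaned up.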