Pass CONDA_OVERRIDE_CUDA to with_cuda of conda-lock #721
Conversation
@amjames Do you have cycles to review this one?
One question, otherwise looks good for a temp fix.
# https://github.com/conda-incubator/conda-store/issues/719
# https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html#overriding-detected-packages
if specification.variables is not None:
    cuda_version = specification.variables.get("CONDA_OVERRIDE_CUDA")
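The hunk above is partial; a self-contained sketch of the same logic (not the exact code from this PR) would initialize the value so that conda-lock's own CUDA detection still applies when no override is set:

# Sketch, not the exact PR code: default to None so conda-lock falls back
# to its normal virtual-package detection when the specification defines
# no CONDA_OVERRIDE_CUDA variable.
cuda_version = None
if specification.variables is not None:
    cuda_version = specification.variables.get("CONDA_OVERRIDE_CUDA")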
The CONDA_OVERRIDE_CUDA variable could come in as a string (expected by conda-lock) or a float.
The example from the description of this PR:

...
variables:
  CONDA_OVERRIDE_CUDA: '12.0'

will come through as a string and be forwarded to conda-lock, but the example in the original issue has:
...
variables:
  CONDA_OVERRIDE_CUDA: 11.8

which would be parsed as a float and forwarded to conda-lock as a float.
I don't see a test update associated with this, so I'm assuming we don't have any coverage here. Could you try this out locally to see if we need to convert the value as we pull it off the specification?
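If conversion did turn out to be necessary, a defensive normalization at the point where the value is read could look like the sketch below (hypothetical; as the reply that follows notes, the value always arrives as a string, so this may be unnecessary):

# Hypothetical sketch: normalize the override to a string before handing it
# to conda-lock, in case YAML parsing ever yields a float such as 11.8.
raw = specification.variables.get("CONDA_OVERRIDE_CUDA")
cuda_version = str(raw) if raw is not None else None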
Added a print to action_solve_lockfile to verify. It's always a string, even if you don't quote it. Tried with 12, 12.0, and 12.2.
(In fact, the admin UI renders previously unquoted versions in quotes once you submit an env.)
So no action is needed here.
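(The verification described above amounts to something like the following sketch; the exact print added locally is not shown in this thread.)

# Temporary debug print inside action_solve_lockfile to inspect the value.
print("CONDA_OVERRIDE_CUDA:", type(cuda_version), repr(cuda_version))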
You also asked offline whether using CONDA_OVERRIDE_CUDA on a machine without a GPU allows for using PyTorch with CUDA on a GPU-enabled machine later. I've tried the following:
- On a machine with an Intel GPU (so no CUDA):
  - added the env from the top comment to conda-store, which has CONDA_OVERRIDE_CUDA: '12.0'
  - waited for it to generate the lockfile (but didn't wait for it to build)
  - downloaded the lockfile
- On a machine with an NVIDIA GPU (and CUDA configured):
  - created an env from the downloaded lockfile via conda-lock install -n mytest1 --micromamba ~/mytest1.json
  - activated the env
  - confirmed that CUDA works:
>>> import torch
>>> torch.__file__
'/home/nkaretnikov/.conda/envs/mytest1/lib/python3.12/site-packages/torch/__init__.py'
>>> print(torch.version.cuda)
12.0
>>> x = torch.tensor(1, device='cuda')
>>> x.device
device(type='cuda', index=0)
>>> x
tensor(1, device='cuda:0')
Please let me know if you have any other questions!
Force-pushed from 08db9db to 8cdaafd.
@trallard Could you approve so I can merge? Andrew has reviewed this.
# conda-lock ignores variables defined in the specification, so this code
# gets the value of CONDA_OVERRIDE_CUDA and passes it to conda-lock via
# the with_cuda parameter, see:
Nit, so non-blocking, but since we said this is an "interim" solution it would be best to also add a TODO or similar indicating that we should look at a more robust approach.
Added a non-blocking comment; otherwise we can merge.
Force-pushed from 614b6ca to 8c9390c.
Fixes #719.
Description
This pull request makes it possible to set the CUDA version by passing the value of the CONDA_OVERRIDE_CUDA specification variable to the with_cuda parameter of conda-lock. See https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html#overriding-detected-packages
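As a rough sketch of the mechanism, the hypothetical helper below only illustrates the idea; conda-store's actual change lives in its conda-lock solve action and may be shaped differently:

def cuda_override_kwargs(specification):
    """Return extra keyword arguments for the conda-lock solve call.

    If the conda-store specification defines CONDA_OVERRIDE_CUDA, forward it
    to conda-lock's with_cuda parameter so the solver assumes that __cuda
    virtual package version instead of probing the machine doing the solve.
    """
    if specification.variables is None:
        return {}
    cuda_version = specification.variables.get("CONDA_OVERRIDE_CUDA")
    if cuda_version is None:
        return {}
    return {"with_cuda": cuda_version}  # e.g. "12.0"

The returned mapping can then be merged into whatever keyword arguments the solve step already passes to conda-lock.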
How I tested this:
- With an environment specification that sets CONDA_OVERRIDE_CUDA: '12.0', checked that the pytorch package in the generated lockfile has constraints matching this version of cuda. The url of the pytorch package should also reflect that (a sketch of this check follows the list), e.g., https://conda.anaconda.org/conda-forge/linux-64/pytorch-2.1.0-cuda120py312hfe5e8c6_301.conda.
- Additionally, on a machine with an NVIDIA card that's configured to use CUDA: checked torch.version.cuda in an environment installed from the lockfile. This version matches the one in the lockfile, which comes from the variable. Note: I had to set the url when calling conda-store CLI in order to match the url used by the server when running via docker.
- Also tried 10.0 and 11.0. Observed that these affect constraints in the pytorch package in the lockfile. Note that I've used different major versions as minor versions don't affect the constraints in the lockfile.
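A sketch of the lockfile check mentioned in the first bullet above; the file name is assumed, and the check simply looks for the cuda120 build string rather than parsing the lockfile structure:

# Hypothetical check: confirm the solved pytorch build is a CUDA 12.0 build
# by searching the downloaded lockfile for the cuda120 build string.
from pathlib import Path

lock_text = Path("mytest1.json").read_text()
assert "pytorch" in lock_text and "cuda120" in lock_text, "no CUDA 12.0 pytorch build found"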
Pull request checklist
Additional information