Pass `CONDA_OVERRIDE_CUDA` to `with_cuda` of conda-lock #721

nkaretnikov · 2024-01-05T05:30:54Z

Fixes #719.

Description

This pull request makes it possible to set the CUDA version by passing the value of the `CONDA_OVERRIDE_CUDA` variable from the environment specification to the `with_cuda` parameter of conda-lock.

See https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html#overriding-detected-packages
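
For readers skimming the thread, here is a minimal sketch of the idea rather than the merged code. The `Specification` class and the `lockfile_kwargs` helper are hypothetical stand-ins introduced only for illustration; the `CONDA_OVERRIDE_CUDA` key and the `with_cuda` parameter name are the only parts taken from this PR.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Specification:
    """Hypothetical stand-in for the environment specification."""

    name: str
    variables: Optional[Dict[str, str]] = None


def lockfile_kwargs(specification: Specification) -> Dict[str, str]:
    """Collect extra keyword arguments for the lockfile solve (illustrative only)."""
    kwargs: Dict[str, str] = {}
    if specification.variables is not None:
        cuda_version = specification.variables.get("CONDA_OVERRIDE_CUDA")
        if cuda_version is not None:
            # Forward the override under the with_cuda parameter name.
            kwargs["with_cuda"] = cuda_version
    return kwargs


spec = Specification(name="test3", variables={"CONDA_OVERRIDE_CUDA": "12.0"})
print(lockfile_kwargs(spec))
```

When the specification has no variables, or the variable is absent, the helper returns an empty dict, so the solver call would be left unchanged.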

How I tested this:

Created this test env via the server admin environment page, because the conda-store UI strips variables even in raw YAML mode and so cannot be used for this:

channels:
- conda-forge
dependencies:
- pytorch
- ipykernel
- pip:
  - nothing
description: test2
name: test3
prefix: null
variables:
  CONDA_OVERRIDE_CUDA: '12.0'

Checked that the pytorch package in the generated lockfile has constraints matching this version of cuda. The url of the pytorch package should also reflect that, e.g., https://conda.anaconda.org/conda-forge/linux-64/pytorch-2.1.0-cuda120py312hfe5e8c6_301.conda.
Additionally, on a machine with an NVIDIA card that's configured to use CUDA:

% CONDA_STORE_USERNAME=test CONDA_STORE_PASSWORD=password  python -m conda_store --conda-store-url="http://localhost:8080/conda-store/" --auth=basic run test/test3:4 -- python
>>> import torch
>>> torch.__file__
'/tmp/conda-store/4/lib/python3.12/site-packages/torch/__init__.py'
>>> print(torch.version.cuda)
12.0

This version matches the one in the lockfile, which comes from the variable. Note: I had to set the url when calling conda-store CLI in order to match the url used by the server when running via docker.

Repeated the same with versions 10.0 and 11.0. Observed that these affect constraints in the pytorch package in the lockfile. Note that I've used different major versions as minor versions don't affect the constraints in the lockfile.
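
The URL check described above can be sketched roughly as follows. This assumes that the CUDA build is encoded in the package filename as `cuda` followed by digits, as in the pytorch URL quoted earlier; `cuda_build_tag` is a hypothetical helper, not part of conda-store or conda-lock.

```python
import re
from typing import Optional

# Assumption: CUDA builds carry a "cuda<digits>" tag in the build string.
_CUDA_TAG = re.compile(r"cuda(\d+)")


def cuda_build_tag(url: str) -> Optional[str]:
    """Return the digits following 'cuda' in a package filename, or None."""
    filename = url.rsplit("/", 1)[-1]
    match = _CUDA_TAG.search(filename)
    return match.group(1) if match else None


url = (
    "https://conda.anaconda.org/conda-forge/linux-64/"
    "pytorch-2.1.0-cuda120py312hfe5e8c6_301.conda"
)
print(cuda_build_tag(url))
```

The helper only extracts the tag; it does not try to map it back to a full version, since minor versions did not change the constraints in these tests.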

Pull request checklist

Did you test this change locally?
Did you update the documentation (if required)?
Did you add/update relevant tests for this change (if required)?

Additional information

netlify · 2024-01-05T05:31:00Z

✅ Deploy Preview for kaleidoscopic-dango-0cf31d ready!

Name	Link
🔨 Latest commit	`8c9390c`
🔍 Latest deploy log	https://app.netlify.com/sites/kaleidoscopic-dango-0cf31d/deploys/65a6ec80fa29ce00088fc76f
😎 Deploy Preview	https://deploy-preview-721--kaleidoscopic-dango-0cf31d.netlify.app

nkaretnikov · 2024-01-05T09:56:34Z

@amjames Do you have cycles to review this one?

amjames

One question; otherwise this looks good for a temporary fix.

amjames · 2024-01-05T14:27:39Z

conda-store-server/conda_store_server/action/generate_lockfile.py

+    # https://github.com/conda-incubator/conda-store/issues/719
+    # https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-virtual.html#overriding-detected-packages
+    if specification.variables is not None:
+        cuda_version = specification.variables.get("CONDA_OVERRIDE_CUDA")


The CONDA_OVERRIDE_CUDA variable could come in as a string (expected by conda-lock) or a float.

The example from the description of this PR

...
variables:
  CONDA_OVERRIDE_CUDA: '12.0'

Will come through as a string and be forwarded to conda-lock, but the example in the original issue has:

...
variables:
  CONDA_OVERRIDE_CUDA: 11.8

Which would be parsed as float and be forwarded to conda-lock as a float.

I don't see a test update associated with this, so I am assuming we don't have any coverage here. Could you try this out locally to see whether we need to convert the value as we pull it off the specification?

nkaretnikov · 2024-01-05

Added a print to action_solve_lockfile to verify. It's always a string, even if you don't quote it. Tried with 12, 12.0, and 12.2.

(In fact, the admin UI renders previously unquoted versions in quotes once you submit an env.)

So no action is needed here.
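
For reference only: had the value arrived as a float or an int, a defensive conversion could look like the sketch below. `normalize_cuda_override` is a hypothetical helper and is not part of this PR, which did not need it.

```python
from typing import Optional, Union


def normalize_cuda_override(value: Union[str, int, float, None]) -> Optional[str]:
    """Return the override as a string, or None when it is unset (illustrative)."""
    if value is None:
        return None
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("CONDA_OVERRIDE_CUDA must be a version, not a boolean")
    return str(value).strip()


for raw in ("12.0", 11.8, 12):
    print(repr(normalize_cuda_override(raw)))
```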

You also asked offline whether using CONDA_OVERRIDE_CUDA on a machine without a GPU allows for using PyTorch with CUDA on a GPU-enabled machine later. I've tried the following:

On a machine with an Intel GPU (so no CUDA):

added the env from the top comment to conda-store, which has CONDA_OVERRIDE_CUDA: '12.0'

waited for it to generate the lockfile (but didn't wait for it to build)

downloaded the lockfile

On a machine with an NVIDIA GPU (and CUDA configured):

created an env from the downloaded lockfile via conda-lock install -n mytest1 --micromamba ~/mytest1.json

activated the env

confirmed that CUDA works:

>>> import torch
>>> torch.__file__
'/home/nkaretnikov/.conda/envs/mytest1/lib/python3.12/site-packages/torch/__init__.py'
>>> print(torch.version.cuda)
12.0
>>> x = torch.tensor(1, device='cuda')
>>> x.device
device(type='cuda', index=0)
>>> x
tensor(1, device='cuda:0')

Please let me know if you have any other questions!

nkaretnikov · 2024-01-08T05:03:12Z

@trallard Could you approve so I could merge? Andrew has reviewed this.

trallard · 2024-01-16T15:16:37Z

conda-store-server/conda_store_server/action/generate_lockfile.py

+    # conda-lock ignores variables defined in the specification, so this code
+    # gets the value of CONDA_OVERRIDE_CUDA and passes it to conda-lock via
+    # the with_cuda parameter, see:


Nit, so non-blocking: since we said this is an "interim" solution, it would be best to also add a TODO or similar note indicating that we should look at a more robust approach.

trallard · 2024-01-16T15:16:57Z

Added a non-blocking comment; otherwise we can merge.

nkaretnikov added the type: bug 🐛, area: configuration, status: in progress 🏗, area: user experience 👩🏻‍💻, block-release ⛔️, and needs: review 👀 labels Jan 5, 2024

nkaretnikov marked this pull request as ready for review January 5, 2024 09:56

nkaretnikov added the area: dependencies 📦 label Jan 5, 2024

amjames reviewed Jan 5, 2024

amjames mentioned this pull request Jan 5, 2024

[BUG] - Conda-Store strips environment vars from env.yaml spec. #719

Closed

nkaretnikov removed the needs: review 👀 label Jan 5, 2024

nkaretnikov requested a review from amjames January 5, 2024 22:57

nkaretnikov added the needs: review 👀 label Jan 5, 2024

amjames approved these changes Jan 6, 2024

nkaretnikov force-pushed the env-vars-719 branch from 08db9db to 8cdaafd January 7, 2024 11:15

nkaretnikov removed the needs: review 👀 label Jan 7, 2024

nkaretnikov mentioned this pull request Jan 8, 2024

Allow passing variables via raw YAML conda-incubator/conda-store-ui#354

Merged

7 tasks

nkaretnikov added status: merge ready 🚀 and removed status: in progress 🏗 labels Jan 8, 2024

nkaretnikov added this to the Release 2024.1.1 milestone Jan 8, 2024

trallard requested a review from dcmcand January 9, 2024 17:34

nkaretnikov removed the block-release ⛔️ label Jan 9, 2024

dharhas mentioned this pull request Jan 9, 2024

[ENH] - Integrate conda-store 2024.1.1 release nebari-dev/nebari#2186

Closed

trallard approved these changes Jan 16, 2024

nkaretnikov added 2 commits January 16, 2024 21:50

Pass CONDA_OVERRIDE_CUDA to with_cuda of conda-lock

b3742de

Fixes conda-incubator#719.

Add a TODO comment

8c9390c

nkaretnikov force-pushed the env-vars-719 branch from 614b6ca to 8c9390c January 16, 2024 20:52

nkaretnikov merged commit f63b4d7 into conda-incubator:main Jan 16, 2024
18 checks passed

nkaretnikov deleted the env-vars-719 branch January 16, 2024 21:27

nkaretnikov removed the status: merge ready 🚀 label Jan 16, 2024

dcmcand mentioned this pull request Feb 8, 2024

[ENH] - Ability to set environment variables on conda-store worker nebari-dev/nebari#1643

Closed
