Install cupy without explicitly requesting cuda-version now defaults to CUDA 11.8 instead of CUDA 12 #247

Closed
1 task done
leofang opened this issue Jan 26, 2024 · 14 comments · Fixed by #249
Labels
question Further information is requested

Comments

@leofang
Member

leofang commented Jan 26, 2024

Solution to issue cannot be found in the documentation.

  • I checked the documentation.

Issue

Either mamba create -n my_env cupy or conda create -n my_env cupy now defaults to CUDA 11.8:

$ mamba create -n pppppp cupy

Looking for: ['cupy']

conda-forge/noarch                                  13.3MB @  28.8MB/s  0.6s
conda-forge/linux-64                                32.0MB @  47.3MB/s  1.2s
Transaction

  Prefix: /home/leof/miniforge3/envs/pppppp

  Updating specs:

   - cupy


  Package                Version  Build                Channel           Size
───────────────────────────────────────────────────────────────────────────────
  Install:
───────────────────────────────────────────────────────────────────────────────

  + python_abi              3.12  4_cp312              conda-forge        6kB
  + _libgcc_mutex            0.1  conda_forge          conda-forge        3kB
  + libstdcxx-ng          13.2.0  h7e041cc_3           conda-forge        4MB
  + ld_impl_linux-64        2.40  h41732ed_0           conda-forge     Cached
  + ca-certificates   2023.11.17  hbcca054_0           conda-forge     Cached
  + libgomp               13.2.0  h807b86a_3           conda-forge      422kB
  + _openmp_mutex            4.5  2_gnu                conda-forge       24kB
  + libgcc-ng             13.2.0  h807b86a_3           conda-forge      774kB
  + libgfortran5          13.2.0  ha4646dd_3           conda-forge        1MB
  + openssl                3.2.0  hd590300_1           conda-forge     Cached
  + libxcrypt             4.4.36  hd590300_1           conda-forge     Cached
  + libzlib               1.2.13  hd590300_5           conda-forge     Cached
  + libffi                 3.4.2  h7f98852_5           conda-forge     Cached
  + bzip2                  1.0.8  hd590300_5           conda-forge      254kB
  + ncurses                  6.4  h59595ed_2           conda-forge      884kB
  + cudatoolkit           11.8.0  h4ba93d1_12          conda-forge      716MB
  + libuuid               2.38.1  h0b41bf4_0           conda-forge     Cached
  + libnsl                 2.0.1  hd590300_0           conda-forge       33kB
  + libexpat               2.5.0  hcb278e6_1           conda-forge     Cached
  + xz                     5.2.6  h166bdaf_0           conda-forge     Cached
  + libgfortran-ng        13.2.0  h69a702a_3           conda-forge       24kB
  + tk                    8.6.13  noxft_h4845f30_101   conda-forge        3MB
  + libsqlite             3.44.2  h2797004_0           conda-forge      846kB
  + readline                 8.2  h8228510_1           conda-forge     Cached
  + libopenblas           0.3.26  pthreads_h413a1c8_0  conda-forge        6MB
  + libblas                3.9.0  21_linux64_openblas  conda-forge       15kB
  + libcblas               3.9.0  21_linux64_openblas  conda-forge       15kB
  + liblapack              3.9.0  21_linux64_openblas  conda-forge       15kB
  + cuda-version            11.8  h70ddcb2_2           conda-forge       21kB
  + tzdata                 2023d  h0c530f3_0           conda-forge      120kB
  + python                3.12.1  hab00c5b_1_cpython   conda-forge       32MB
  + wheel                 0.42.0  pyhd8ed1ab_0         conda-forge       58kB
  + setuptools            69.0.3  pyhd8ed1ab_0         conda-forge      471kB
  + pip                   23.3.2  pyhd8ed1ab_0         conda-forge        1MB
  + fastrlock              0.8.2  py312h30efb56_2      conda-forge       38kB
  + numpy                 1.26.3  py312heda63a1_0      conda-forge        7MB
  + cupy-core             13.0.0  py312hd7a312d_1      conda-forge       45MB
  + cupy                  13.0.0  py312h8e83189_1      conda-forge      353kB

  Summary:

  Install: 38 packages

  Total download: 821MB

───────────────────────────────────────────────────────────────────────────────


Confirm changes: [Y/n]

I expect the latest CUDA to be used.
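
One way to check a solve like this mechanically is to scan the transaction table for the cuda-version row. The following is a minimal sketch, assuming the `+ name version build channel size` row layout printed above; `parse_transaction` and `picked_cuda` are hypothetical helper names, not part of mamba or conda. Requesting the version explicitly, e.g. `mamba create -n my_env cupy cuda-version=12`, avoids relying on the solver's default.

```python
from __future__ import annotations

import re

# Matches install rows such as:
#   "  + cuda-version            11.8  h70ddcb2_2           conda-forge       21kB"
# Assumption: rows follow the layout of the transaction table shown above.
ROW = re.compile(
    r"^\s*\+\s+(?P<name>\S+)\s+(?P<version>\S+)\s+(?P<build>\S+)\s+(?P<channel>\S+)"
)


def parse_transaction(text: str) -> dict[str, str]:
    """Map package name -> version for every '+' row in a transaction table."""
    packages = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            packages[match["name"]] = match["version"]
    return packages


def picked_cuda(text: str) -> str | None:
    """Return the cuda-version the solver selected, or None if it is absent."""
    return parse_transaction(text).get("cuda-version")


# Rows copied from the transaction output above.
sample = """\
  + cudatoolkit           11.8.0  h4ba93d1_12          conda-forge      716MB
  + cuda-version            11.8  h70ddcb2_2           conda-forge       21kB
  + cupy-core             13.0.0  py312hd7a312d_1      conda-forge       45MB
  + cupy                  13.0.0  py312h8e83189_1      conda-forge      353kB
"""
print(picked_cuda(sample))  # 11.8
```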

Installed packages

# packages in environment at /home/leof/miniforge3:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
archspec                  0.2.2              pyhd8ed1ab_0    conda-forge
boltons                   23.0.0             pyhd8ed1ab_0    conda-forge
brotlipy                  0.7.0           py39hb9d737c_1005    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.25.0               hd590300_0    conda-forge
ca-certificates           2023.11.17           hbcca054_0    conda-forge
certifi                   2023.11.17         pyhd8ed1ab_0    conda-forge
cffi                      1.15.1           py39he91dace_3    conda-forge
charset-normalizer        3.1.0              pyhd8ed1ab_0    conda-forge
colorama                  0.4.6              pyhd8ed1ab_0    conda-forge
conda                     23.11.0          py39hf3d152e_1    conda-forge
conda-libmamba-solver     23.12.0            pyhd8ed1ab_0    conda-forge
conda-package-handling    2.2.0              pyh38be061_0    conda-forge
conda-package-streaming   0.9.0              pyhd8ed1ab_0    conda-forge
cryptography              41.0.7           py39he6105cc_1    conda-forge
distro                    1.9.0              pyhd8ed1ab_0    conda-forge
fmt                       10.1.1               h00ab1b0_1    conda-forge
icu                       72.1                 hcb278e6_0    conda-forge
idna                      3.4                pyhd8ed1ab_0    conda-forge
jsonpatch                 1.32               pyhd8ed1ab_0    conda-forge
jsonpointer               2.0                        py_0    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
krb5                      1.21.2               h659d440_0    conda-forge
ld_impl_linux-64          2.40                 h41732ed_0    conda-forge
libarchive                3.7.2                h039dbb9_0    conda-forge
libcurl                   8.5.0                hca28451_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libiconv                  1.17                 h166bdaf_0    conda-forge
libmamba                  1.5.6                had39da4_0    conda-forge
libmambapy                1.5.6            py39h10defb6_0    conda-forge
libnghttp2                1.58.0               h47da74e_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libsolv                   0.7.23               h3eb15da_0    conda-forge
libsqlite                 3.40.0               h753d276_0    conda-forge
libssh2                   1.11.0               h0841786_0    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libuuid                   2.38.1               h0b41bf4_0    conda-forge
libxml2                   2.11.5               h0d562d8_0    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lz4-c                     1.9.4                hcb278e6_0    conda-forge
lzo                       2.10              h516909a_1000    conda-forge
mamba                     1.5.6            py39hc5d2bb1_0    conda-forge
menuinst                  2.0.2            py39hf3d152e_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
openssl                   3.2.0                hd590300_1    conda-forge
packaging                 23.0               pyhd8ed1ab_0    conda-forge
pip                       21.2.4             pyhd8ed1ab_0    conda-forge
platformdirs              4.1.0              pyhd8ed1ab_0    conda-forge
pluggy                    1.0.0              pyhd8ed1ab_5    conda-forge
pybind11-abi              4                    hd8ed1ab_3    conda-forge
pycosat                   0.6.4            py39hb9d737c_1    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pyopenssl                 23.3.0             pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.9.16          h2782a2a_0_cpython    conda-forge
python_abi                3.9                      3_cp39    conda-forge
readline                  8.2                  h8228510_1    conda-forge
reproc                    14.2.4               h0b41bf4_0    conda-forge
reproc-cpp                14.2.4               hcb278e6_0    conda-forge
requests                  2.28.2             pyhd8ed1ab_1    conda-forge
ruamel.yaml               0.17.21          py39h72bdee0_3    conda-forge
ruamel.yaml.clib          0.2.7            py39h72bdee0_1    conda-forge
ruamel_yaml               0.15.80         py39hb9d737c_1008    conda-forge
setuptools                67.6.1             pyhd8ed1ab_0    conda-forge
sqlite                    3.40.0               h4ff8645_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
toolz                     0.12.0             pyhd8ed1ab_0    conda-forge
tqdm                      4.65.0             pyhd8ed1ab_1    conda-forge
tzdata                    2023c                h71feb2d_0    conda-forge
urllib3                   1.26.15            pyhd8ed1ab_0    conda-forge
wheel                     0.40.0             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
yaml-cpp                  0.8.0                h59595ed_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstandard                 0.19.0           py39hb9d737c_0    conda-forge
zstd                      1.5.5                hfc55251_0    conda-forge

Environment info

active environment : base
    active env location : /home/leof/miniforge3
            shell level : 1
       user config file : /home/leof/.condarc
 populated config files : /home/leof/miniforge3/.condarc
                          /home/leof/.condarc
          conda version : 23.11.0
    conda-build version : not installed
         python version : 3.9.16.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=zen2
                          __conda=23.11.0=0
                          __cuda=12.2=0
                          __glibc=2.31=0
                          __linux=5.8.0=0
                          __unix=0=0
       base environment : /home/leof/miniforge3  (writable)
      conda av data dir : /home/leof/miniforge3/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /home/leof/miniforge3/pkgs
                          /home/leof/.conda/pkgs
       envs directories : /home/leof/miniforge3/envs
                          /home/leof/.conda/envs
               platform : linux-64
             user-agent : conda/23.11.0 requests/2.28.2 CPython/3.9.16 Linux/5.8.0-53-generic ubuntu/20.04.2 glibc/2.31 solver/libmamba conda-libmamba-solver/23.12.0 libmambapy/1.5.6
                UID:GID : 1019:1019
             netrc file : None
           offline mode : False
@leofang added the question label (Further information is requested) on Jan 26, 2024
@leofang
Member Author

leofang commented Jan 26, 2024

I suspect this is because cuda-version is now a transitive dependency of cupy (through cupy-core) instead of a direct dependency, and we hit a dark corner of the solver.

@bdice
Contributor

bdice commented Jan 26, 2024

Yes, I think cuda-version being a transitive dependency may be affecting this, and the cucim issue linked above. Can we add cuda-version as a dependency of cupy and not just cupy-core?

@leofang
Member Author

leofang commented Jan 26, 2024

I can do that, but to me this solver behavior is questionable (it should attempt to ensure all dependencies are updated, at least when creating a new env without any existing installation), and I'd like to confirm there's no other way to resolve this before applying any WARs.

@jameslamb
Member

@leofang @bdice @jakirkham I believe I just observed something similar in cuspatial's CI: rapidsai/cuspatial#1320 (comment)

@leofang
Member Author

leofang commented Jan 26, 2024

> I'd like to confirm there's no other way to resolve this before applying any WARs.

Asking in the CF Gitter channel: https://matrix.to/#/!SOyumkgPRWoXfQYIFH:matrix.org/$170628429022aRuJZ:gitter.im?via=matrix.org&via=gitter.im&via=cadair.com

@leofang
Member Author

leofang commented Jan 27, 2024

No response to my Gitter thread, but I did find this Q&A touching on similar points:
https://conda.github.io/conda-libmamba-solver/user-guide/libmamba-vs-classic/#cudatoolkit-present-in-a-cpuonly-environment

> This can be solved at the packaging level, where all the variants rely on the package mutex directly, instead of relying on packages that depend on the mutex.
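
The difference between depending on cuda-version directly and only through another package can be made concrete with a toy dependency map. This is a sketch with hypothetical, simplified metadata: the `DEPENDS` contents and the `classify` helper are illustrative assumptions, not the real repodata for these packages and not how either solver is implemented.

```python
# Hypothetical, simplified dependency map (illustration only, not real repodata).
DEPENDS = {
    "cupy": ["cupy-core"],
    "cupy-core": ["cuda-version", "numpy"],
    "cuda-version": [],
    "numpy": [],
}


def classify(pkg, target, graph):
    """Return 'direct', 'transitive', or 'none' for pkg's dependency on target."""
    if target in graph.get(pkg, []):
        return "direct"
    # Walk the remaining dependency tree looking for target further down.
    seen, stack = set(), list(graph.get(pkg, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        children = graph.get(dep, [])
        if target in children:
            return "transitive"
        stack.extend(children)
    return "none"


print(classify("cupy", "cuda-version", DEPENDS))       # transitive
print(classify("cupy-core", "cuda-version", DEPENDS))  # direct

# Packaging-level change: list cuda-version on cupy itself as well.
patched = {**DEPENDS, "cupy": ["cupy-core", "cuda-version"]}
print(classify("cupy", "cuda-version", patched))       # direct
```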

@leofang
Member Author

leofang commented Jan 27, 2024

I can confirm locally that --solver classic and --solver libmamba give different outcomes. What we see is the result of libmamba. Since technically we packaged cupy/cupy-core correctly, and the Mamba documentation does warn users to be as explicit as possible:

Explicit is better

I am not sure if we want to apply any workaround. Thoughts? @bdice @jameslamb @jakirkham?

@bdice
Contributor

bdice commented Jan 27, 2024

I would apply the workaround by pinning cuda-version in cupy, and add a comment that says it helps ensure libmamba finds the right versions without unwanted upgrades.

@jakirkham
Member

jakirkham commented Jan 29, 2024

Thanks everyone for the discussion and Leo for the fix! 🙏

As this is a totally new package structure for cupy, I think we are still learning what right looks like for this feedstock based on user feedback, with this being just one instance of that feedback.

It's worth noting in the RAPIDS use case, our CI runs do explicitly specify the cuda-version in an environment file. I would hope the solver treats this as explicitly requested, but what these two issues show is that this is not happening. Maybe cuda-version is only treated as explicitly requested when specified via the command line, which seems a bit odd from a user perspective.

In any event, the risk in adding cuda-version is low and the value is notable, so this seems like a good resolution here.

@leofang
Member Author

leofang commented Jan 29, 2024

Thanks, John!

> It's worth noting in the RAPIDS use case, our CI runs do explicitly specify the cuda-version in an environment file.

This is a bit surprising. Based on this thread, I was under the impression that the RAPIDS CIs haven't enforced it. It would be nice to check how things look now that the fix was merged.

@leofang
Member Author

leofang commented Jan 29, 2024

Confirmed locally that the fix is working. Now both solvers behave consistently (and as intended -- newer cuda-version is picked when unspecified).
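
A check like this could be scripted roughly as follows. This is a sketch under stated assumptions: it assumes `conda create --dry-run --json` emits a report with an `actions` → `LINK` list of records carrying `name` and `version` keys, and that `--solver` accepts `classic` and `libmamba` as used earlier in this thread. The function names and the `solver-check` environment name are hypothetical.

```python
from __future__ import annotations

import json
import subprocess


def cuda_version_from_report(report: dict) -> str | None:
    """Return the cuda-version listed in a dry-run report, or None if absent.

    Assumption: the report uses the {"actions": {"LINK": [{"name", "version"}]}}
    layout of `conda create --dry-run --json`.
    """
    for record in report.get("actions", {}).get("LINK", []):
        if record.get("name") == "cuda-version":
            return record.get("version")
    return None


def dry_run(solver: str, spec: str = "cupy", env: str = "solver-check") -> dict:
    """Solve `spec` with the given solver without installing; needs conda on PATH."""
    cmd = ["conda", "create", "-n", env, spec,
           "--dry-run", "--json", "--solver", solver]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return json.loads(result.stdout)


def compare_solvers(reports: dict[str, dict]) -> dict[str, str | None]:
    """Map solver name -> selected cuda-version for each supplied report."""
    return {name: cuda_version_from_report(r) for name, r in reports.items()}


# Hand-written sample mirroring the 11.8 pick reported at the top of this issue.
sample = {"actions": {"LINK": [
    {"name": "cuda-version", "version": "11.8"},
    {"name": "cupy", "version": "13.0.0"},
]}}
print(cuda_version_from_report(sample))  # 11.8
```

With conda available, `compare_solvers({s: dry_run(s) for s in ("classic", "libmamba")})` would report what each solver selects; equal values indicate that the two solvers agree.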

@jakirkham
Member

> It would be nice to check how things look now that the fix was merged.

Have restarted CI in these (and dropped any prior workarounds)

@jakirkham
Member

> This is a bit surprising. Based on this thread, I was under the impression that the RAPIDS CIs haven't enforced it. It would be nice to check how things look now that the fix was merged.

Have added another comment in that thread

What I learned from looking at that again is that the solver actually recognizes cuda-version=11.8 as explicitly requested in that case, then goes ahead and ignores it later 🤷‍♂️

@hmaarrfk
Contributor

I'm not sure if such a package would be appreciated at conda-forge:

{% set version = "1.0" %}

package:
  name: nocuda11
  version: {{ version }}


build:
  number: 0
  noarch: generic

requirements:
  run_constrained:
    - cuda-version !=11.*

test:
  commands:
    - echo "no tests for this package"

but it is helping us internally
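
The effect of the `run_constrained` entry can be approximated in a few lines: once `nocuda11` is in an environment, any cuda-version candidate matching `11.*` is ruled out. The sketch below uses a shell-style glob as a stand-in for conda's version matching, which is an approximation rather than conda's actual implementation; the candidate list and the `allowed_by_constraint` helper are hypothetical.

```python
from fnmatch import fnmatchcase


def allowed_by_constraint(version: str, excluded_glob: str = "11.*") -> bool:
    """Approximate `cuda-version !=11.*`: reject versions matching the glob."""
    return not fnmatchcase(version, excluded_glob)


# Hypothetical candidate versions, for illustration only.
candidates = ["11.2", "11.8", "12.0", "12.2"]
allowed = [v for v in candidates if allowed_by_constraint(v)]
print(allowed)  # ['12.0', '12.2']
```

Because `run_constrained` only restricts packages that end up in the environment and does not pull them in, the constraint takes effect only when `nocuda11` is installed alongside something that brings in cuda-version.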

5 participants