Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda host dependencies #6071

Merged
merged 19 commits into from
Sep 24, 2024

Conversation

jameslamb
Copy link
Member

@jameslamb jameslamb commented Sep 18, 2024

Description

Contributes to rapidsai/build-planning#56

Now that most of conda-forge has been updated to fmt >=11.0.1,<12 and spdlog>=1.14.1,<1.15 (rapidsai/build-planning#56 (comment)), we're attempting to upgrade RAPIDS to similar versions of those libraries.

This improves the likelihood that RAPIDS will be installable alongside newer versions of its
dependencies and complementary packages on conda-forge.

Notes for Reviewers

This PR is testing changes made in rapidsai/rapids-cmake#689.

@jameslamb jameslamb added 5 - DO NOT MERGE Hold off on merging; see PR for details improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Sep 18, 2024
@github-actions github-actions bot added the ci label Sep 18, 2024
@@ -4,6 +4,7 @@
set -euo pipefail

. /opt/conda/etc/profile.d/conda.sh
source ./ci/use_conda_packages_from_prs.sh
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not specific to this line, but just choosing a place to start a threaded conversation.

All conda-based tests are failing with issues like this:

Fatal Python error: Aborted
Thread 0x00007f1903eea640 (most recent call first):
  File "/opt/conda/envs/test/lib/python3.12/site-packages/execnet/gateway_base.py", line 534 in read
  File "/opt/conda/envs/test/lib/python3.12/site-packages/execnet/gateway_base.py", line 567 in from_io
  File "/opt/conda/envs/test/lib/python3.12/site-packages/execnet/gateway_base.py", line 1160 in _thread_receiver
  File "/opt/conda/envs/test/lib/python3.12/site-packages/execnet/gateway_base.py", line 341 in run
  File "/opt/conda/envs/test/lib/python3.12/site-packages/execnet/gateway_base.py", line 411 in _perform_spawn
Current thread 0x00007f1904b0e740 (most recent call first):
  File "/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/api_decorators.py", line 190 in wrapper
...
Extension modules: cython.cimports.libc.math, Cython.Utils, numpy._core._multiarray_umath, numpy._core._multiarray_tests
...
.[gw13] node down: Not properly terminated

(build link)

I'll investigate this right now. I suspect it might be something related to symbol visibility (like rapidsai/cudf#15483 (comment)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah!! It looks like either I missed some dependencies in the test scripts, or something else is holding back the fmt / spdlog version?

I see that RAPIDS libraries are coming from rapidsai-nightly and not the CI artifacts, and that the older versions of fmt and spdlog are being used.

...
cudf                      24.10.00a369    cuda11_py310_240923_g9b4c4c721c_369    rapidsai-nightly
cuml                      24.10.00a70     cuda11_py310_240923_ga1d76d4da_70    file:///tmp/python_channel
...
distributed-ucxx          0.40.00a        py3.10_240923_ga07a40f_28    rapidsai-nightly
...
fmt                       10.2.1               h00ab1b0_0    conda-forge
...

(build link)

Will try replicating that conda solve locally.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing locally, I found that if I added constraints like fmt>=11.0.2, spdlog>=1.14.1 in the conda environment, the solver was able to solve with all of cuml's runtime dependencies... including the versions of cudf, rmm, ucxx, etc. produced from the PRs for rapidsai/build-planning#56.

I think that libcuml conda packages need host: dependencies on spdlog and fmt.

That library's code directly uses both of those:

#include <spdlog/sinks/stdout_color_sinks.h> // NOLINT
#include <spdlog/spdlog.h> // NOLINT

std::string msg_string = fmt::to_string(formatted);

And having those dependencies would prevent these situations during development where the solver is able to choose to fall back to older RAPIDS nightlies from within the same release (but supporting different fmt / spdlog).

cudf is doing the same thing: https://github.com/rapidsai/cudf/blob/9b4c4c721c399bae9e88733da79daa1a10644481/conda/recipes/libcudf/meta.yaml#L69-L71

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably the right solution, thanks for investigating.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ended up not being enough 😭

Look at https://github.com/rapidsai/cuml/actions/runs/10998530224/job/30541086357?pr=6071.

The environment is still solving with fmt==10.2.1, spdlog==1.12.0, and older RAPIDS 24.10.* nightlies that support those (build link)... even though it used the cuml / libcuml packages from CI here.

cudf                      24.10.00a371    cuda11_py310_240923_g625590686f_371    rapidsai-nightly
cuml                      24.10.00a71     cuda11_py310_240923_g5b812f0e9_71    file:///tmp/python_channel
...
fmt                       10.2.1               h00ab1b0_0    conda-forge
...
libcudf                   24.10.00a371    cuda11_240923_g625590686f_371    rapidsai-nightly
...
libcuml                   24.10.00a71     cuda11_240923_g5b812f0e9_71    file:///tmp/cpp_channel
libcumlprims              24.10.00a       cuda11_240923_gddfb9dd_8    rapidsai-nightly
...
rmm                       24.10.00a40     cuda11_py310_240923_g58039d3c_40    rapidsai-nightly
...
spdlog                    1.12.0               hd2e6256_2    conda-forge
...

I just pushed another commit explicitly pinning fmt and spdlog in the conda test environments, to force the solver to use the newer packages.

This should be safe to do... as soon as there are librmm packages published to rapids-nightly containing the changes from rapidsai/rmm#1678, every environment containing the latest librmm / rmm will have those same pins in it.

If that works, that's evidence that the fmt / spdlog update is safe and we can start moving forward.

I think both of those changes are worth merging here

  • adding the host: dependencies = to make CPM unnecessary for spdlog and avoid unnecessary vendoring and therefore file clobbering
  • explicitly pinning in test environments = make such fallbacks during development less likely the next time we do an update like this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 I see tests now passing and fmt==11.0.2 + spdlog==1.14.1 getting installed!

Build link: https://github.com/rapidsai/cuml/actions/runs/11000193616/job/30544094053?pr=6071

Another thing that I think happened here.... it's been several days since the cudf CI artifacts being pulled in here were produced, so the rapidsai-nightly channel's versions are now newer.

e.g. rapidsai/cudf#16806 produces cudf==24.10.00a364 but rapidsai-nightly now contains cudf==24.10.00a371. That would also explain why some of the CI artifacts are being ignored, in favor of those from rapidsai-nightly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See here: #6071 (comment)

@github-actions github-actions bot added the conda conda issue label Sep 23, 2024
@jameslamb jameslamb changed the title WIP: [DO NOT MERGE] Update fmt (to 11.0.2) and spdlog (to 1.14.1). WIP: [DO NOT MERGE] Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda dependencies Sep 23, 2024
Comment on lines +71 to +76
- fmt {{ fmt_version }}
- libcumlprims ={{ minor_version }}
- libraft ={{ minor_version }}
- libraft-headers ={{ minor_version }}
- librmm ={{ minor_version }}
- spdlog {{ spdlog_version }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are in the global list for the split recipe but also need to be in the libcuml package below in order to have run-exports on the appropriate packages. That would eliminate the need for test-time pinnings. I'm not sure which direction we want to go: do we add run-exported dependencies on these to "runtime" packages (these aren't "dev" packages) to enforce consistency, or use test-time pinnings to only ensure that we get compatible builds of librmm, etc. which are exporting these dependencies (and with which we need to avoid conflicts)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok! I'm not that familiar with the ways that split recipes are different from single-package ones, thank you for pointing this out.

I'm not sure precisely what you mean about those two choices... but I can say at least that I think cuml / libcuml should explicitly pin fmt and spdlog, given that it directly uses them: #6071 (comment) (instead of getting that pinning transitively from librmm or similar).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked offline and decided to add these pinnings in the top-level host environment and the build environment created from the common_build: list in dependencies.yaml, but to not mess with run exports or adding explicit run dependencies on fmt / spdlog.

Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving but we may have a cleanup task related to host dependencies / run-exports.

@jameslamb jameslamb removed the 5 - DO NOT MERGE Hold off on merging; see PR for details label Sep 23, 2024
@jameslamb jameslamb changed the title WIP: [DO NOT MERGE] Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda dependencies Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda dependencies Sep 23, 2024
@jameslamb jameslamb changed the title Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda dependencies Update fmt (to 11.0.2) and spdlog (to 1.14.1), add those libraries to libcuml conda host dependencies Sep 23, 2024
@github-actions github-actions bot removed the ci label Sep 23, 2024
@jameslamb jameslamb marked this pull request as ready for review September 23, 2024 23:04
@jameslamb jameslamb requested a review from a team as a code owner September 23, 2024 23:04
@jameslamb
Copy link
Member Author

🎉 this is working as expected

...
cudf                      24.10.00a373    cuda12_py312_240923_g8b12cf4e66_373    rapidsai-nightly
cuml                      24.10.00a78     cuda12_py312_240923_gc58738941_78    file:///tmp/python_channel
cupy                      13.3.0          py312h7d319b9_0    conda-forge
...
fmt                       11.0.2               h434a139_0    conda-forge
...
spdlog                    1.14.1               hed91bc2_1    conda-forge
...

(build link)

@jameslamb
Copy link
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 932d3b1 into rapidsai:branch-24.10 Sep 24, 2024
62 of 63 checks passed
@jameslamb jameslamb deleted the fmt-and-spdlog branch September 24, 2024 03:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conda conda issue improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants