[EM] Support mmap backed ellpack. #10602
Comments. Use context. Work on cuda mmap. Close stream. Split up helpers. dispatch. Test flag. Cleanup. CPU. Dispatch. Windows. Leak. Cleanup. Fix. Windows. lint. windows. cleanup.
cc @rongou
dh::safe_cuda(
    cudaMemAdvise(handle_->base_ptr, handle_->base_size, cudaMemAdviseSetAccessedBy, device));
dh::safe_cuda(
    cudaMemPrefetchAsync(handle_->base_ptr, handle_->base_size, device, dh::DefaultStream()));
Does this always happen after the data is initialized on the cpu?
Yes, for now. We might optionally only prefetch the gradient index part and leave other scalars on host.
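Prefetching only one part of the mapping means advising a sub-range rather than the whole region. The following is a host-side sketch of that idea using POSIX `madvise`, not the CUDA calls from the diff. `PageRange`, `AlignToPages`, and `AdviseWillNeed` are hypothetical names introduced for illustration, and the sketch assumes `base` is the page-aligned address returned by `mmap`.

```cpp
#include <cassert>
#include <cstddef>

#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper type: a byte range relative to the start of a mapping.
struct PageRange {
  std::size_t offset;
  std::size_t size;
};

// Expand [offset, offset + size) outward to page boundaries.
// `page` must be a non-zero power of two.
inline PageRange AlignToPages(std::size_t offset, std::size_t size, std::size_t page) {
  assert(page != 0 && (page & (page - 1)) == 0);
  std::size_t begin = offset & ~(page - 1);
  std::size_t end = (offset + size + page - 1) & ~(page - 1);
  return {begin, end - begin};
}

// Ask the kernel to read ahead only the given sub-range of a mapping.
// `base` is assumed to be the page-aligned pointer returned by mmap.
// Returns false if the range starts outside the mapping or madvise fails.
inline bool AdviseWillNeed(void* base, std::size_t map_size, std::size_t offset,
                           std::size_t size) {
  auto page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  PageRange r = AlignToPages(offset, size, page);
  if (r.offset >= map_size) {
    return false;
  }
  if (r.offset + r.size > map_size) {
    r.size = map_size - r.offset;  // clamp to the end of the mapping
  }
  return madvise(static_cast<char*>(base) + r.offset, r.size, MADV_WILLNEED) == 0;
}
```

The alignment step is needed because `madvise` requires a page-aligned start address. Whether a partial prefetch actually helps on the device side would still need the profiling mentioned in the PR description.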
[EM] Support mmap backed ellpack. (#10602)
- Support resource view in ellpack.
- Define the CUDA version of MMAP resource.
- Define the CUDA version of malloc resource.
- Refactor CUDA runtime API wrappers, and add memory access related wrappers.
- Gather Windows macros into a single header.
This PR changes ellpack to use the resource view as its backing storage, similar to its CPU counterparts. This will help us enable more options for GPU external memory, including mmap, normal malloc, device malloc, pinned memory, and managed memory. For now, only mmap is used. The PR supports both HMM and non-HMM systems. In addition, it includes some refactoring that splits up the wrappers of the CUDA runtime API to make them usable with a host compiler.
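To illustrate the general idea of a view over interchangeable backing storage, here is a minimal host-only sketch. It is not XGBoost's implementation: `Resource`, `MallocResource`, `MmapResource`, and `ResourceView` are hypothetical names chosen for this example, and only the malloc and read-only mmap backends are shown, assuming a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>
#include <stdexcept>
#include <string>
#include <utility>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical interface: owns a block of bytes, however it was obtained.
class Resource {
 public:
  virtual ~Resource() = default;
  virtual void* Data() = 0;
  virtual std::size_t Size() const = 0;
};

// Backend 1: plain malloc.
class MallocResource : public Resource {
  void* ptr_{nullptr};
  std::size_t n_{0};

 public:
  explicit MallocResource(std::size_t n) : ptr_{std::malloc(n)}, n_{n} {
    if (!ptr_ && n != 0) throw std::bad_alloc{};
  }
  MallocResource(MallocResource const&) = delete;
  MallocResource& operator=(MallocResource const&) = delete;
  ~MallocResource() override { std::free(ptr_); }
  void* Data() override { return ptr_; }
  std::size_t Size() const override { return n_; }
};

// Backend 2: a read-only memory mapping of a whole file.
class MmapResource : public Resource {
  void* ptr_{MAP_FAILED};
  std::size_t n_{0};

 public:
  explicit MmapResource(std::string const& path) {
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error{"open failed: " + path};
    struct stat st {};
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
      close(fd);
      throw std::runtime_error{"fstat failed or empty file: " + path};
    }
    n_ = static_cast<std::size_t>(st.st_size);
    ptr_ = mmap(nullptr, n_, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (ptr_ == MAP_FAILED) throw std::runtime_error{"mmap failed: " + path};
  }
  MmapResource(MmapResource const&) = delete;
  MmapResource& operator=(MmapResource const&) = delete;
  ~MmapResource() override {
    if (ptr_ != MAP_FAILED) munmap(ptr_, n_);
  }
  void* Data() override { return ptr_; }
  std::size_t Size() const override { return n_; }
};

// Typed, read-only window into a resource. The shared_ptr keeps the storage
// alive for as long as any view refers to it. `byte_offset` is assumed to be
// suitably aligned for T.
template <typename T>
class ResourceView {
  std::shared_ptr<Resource> res_;
  T const* ptr_{nullptr};
  std::size_t n_{0};

 public:
  ResourceView(std::shared_ptr<Resource> res, std::size_t byte_offset, std::size_t n)
      : res_{std::move(res)}, n_{n} {
    assert(byte_offset + n * sizeof(T) <= res_->Size());
    ptr_ = reinterpret_cast<T const*>(static_cast<std::uint8_t*>(res_->Data()) + byte_offset);
  }
  T const& operator[](std::size_t i) const { return ptr_[i]; }
  std::size_t size() const { return n_; }
};
```

Code that reads through `ResourceView<T>` does not need to know which backend produced the bytes, which is what makes it possible to add further backends later without touching the consumers.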
I haven't done any profiling yet, as I need another branch to run a real batch-based workflow. The madvise and prefetching calls are mostly placeholders. I will experiment with various backends after this PR.
To-do