forked from rapidsai/cudf
test #7
Open
galipremsagar wants to merge 955 commits into branch-24.06 from test
This aligns the polars dependency with the most modern version supported by cudf-polars in this branch. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - James Lamb (https://github.com/jameslamb) URL: rapidsai#16442
…16161) Introduces a new environment variable `LOG_FAST_FALLBACK` which will create a structured log of the call that failed. An example of the log is ``` INFO:root:{"debug_type": "LOG_FAST_FALLBACK", "failed_call": "pandas._libs.interval.Interval(0,1)", "exception": "Exception", "exception_message": "Cannot transform _Unusable", "pandas_object": "pandas._libs.interval.Interval", "passed_args": "0,1,", "passed_kwargs": {}} ``` I could turn this into a warning instead, but I imagine we would want to first utilize this to parse the failures and see generalized failures in aggregate Authors: - Matthew Roeschke (https://github.com/mroeschke) - GALI PREM SAGAR (https://github.com/galipremsagar) - Matthew Murray (https://github.com/Matt711) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#16161
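The aggregate parsing mentioned above could look roughly like the following. This is a minimal sketch, not part of the PR: the function names are hypothetical, and it assumes each log line has the `INFO:root:` prefix followed by a single JSON object, as in the example log.

```python
import json
from collections import Counter

# Assumption: lines look like the example above, "INFO:root:" + one JSON object.
PREFIX = "INFO:root:"


def parse_fallback_line(line):
    """Return the JSON payload of a LOG_FAST_FALLBACK line, or None."""
    if not line.startswith(PREFIX):
        return None
    try:
        record = json.loads(line[len(PREFIX):])
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    if record.get("debug_type") != "LOG_FAST_FALLBACK":
        return None
    return record


def count_failures(lines):
    """Count fallbacks grouped by (pandas_object, exception_message)."""
    counts = Counter()
    for line in lines:
        record = parse_fallback_line(line.strip())
        if record is not None:
            key = (record.get("pandas_object"), record.get("exception_message"))
            counts[key] += 1
    return counts


sample = (
    'INFO:root:{"debug_type": "LOG_FAST_FALLBACK", '
    '"failed_call": "pandas._libs.interval.Interval(0,1)", '
    '"exception": "Exception", '
    '"exception_message": "Cannot transform _Unusable", '
    '"pandas_object": "pandas._libs.interval.Interval", '
    '"passed_args": "0,1,", "passed_kwargs": {}}'
)
summary = count_failures([sample, sample, "unrelated log line"])
```

Grouping by `pandas_object` and `exception_message` is only one possible choice; grouping by `failed_call` would give a finer breakdown.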
## Description

Unless `flatbuffers` is added to the conda environment in which `libcudf` is built, the build fails with:

```
In file included from /nvme/0/pgali/cudf/cpp/src/io/parquet/arrow_schema_writer.cpp:26:
/nvme/0/pgali/cudf/cpp/src/io/parquet/ipc/Message_generated.h:6:10: fatal error: flatbuffers/flatbuffers.h: No such file or directory
    6 | #include <flatbuffers/flatbuffers.h>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
```

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [ ] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
…apidsai#16440)

## Description

Fixes internal `parquet_field_list` subclass constructors capturing an invalid `this` pointer when passing objects to `std::make_tuple`.

The `std::make_tuple` usage creates a parameter object that is constructed, moved, and destroyed. The `this` pointer is captured during the constructor call. The move constructor is then called, which creates its own separate `this` pointer (all member data is moved/copied appropriately). The original `this` pointer is invalidated by the destructor that follows. As a result, the lambda that was captured in the constructor no longer contains a valid `this` value in the final moved object.

This PR removes the dependency on the `this` pointer in the lambda and captures the vector reference instead, which is preserved correctly in the object move. The ctor, move, dtor pattern occurs because of how `std::make_tuple` is implemented by the standard library.

Closes rapidsai#16408

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
…24.08 Fix merge conflict for auto merge 16447
Mostly transfers methods that were defined on `Series.dt` to `DatetimeColumn` so they can be reused in `DatetimeIndex`. Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Lawrence Mitchell (https://github.com/wence-) - GALI PREM SAGAR (https://github.com/galipremsagar) URL: rapidsai#16367
Mostly exposing methods that were available on the `TimedeltaColumn` Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: rapidsai#16368
Implemented the relatively straightforward, missing APIs and raised `NotImplementedError` for the others Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: rapidsai#16371
…sai#16441) This removes the need to `import cudf` in `test_column_from_device` and removes a runtime dependency on numpy in the associated pylibcudf column method. Authors: - https://github.com/brandon-b-miller - Thomas Li (https://github.com/lithomas1) Approvers: - Thomas Li (https://github.com/lithomas1) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#16441
…apidsai#16436) rapidsai#16277 removed a universal cast to a `cupy.array` in `_from_array`. Although the typing suggested this method should only accept `np.ndarray` or `cupy.ndarray`, it is called on any object implementing `__cuda_array_interface__` or `__array_interface__` (e.g. `numba.DeviceArray`), which caused a performance regression in cuspatial (rapidsai/cuspatial#1413).

closes rapidsai#16434

```python
In [1]: import cupy, numba.cuda

In [2]: import cudf

In [3]: cupy_array = cupy.ones((10_000, 100))

In [4]: %timeit cudf.DataFrame(cupy_array)
3.88 ms ± 52 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit cudf.DataFrame(numba.cuda.to_device(cupy_array))
3.99 ms ± 35.4 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```

---------

Co-authored-by: Bradley Dice <[email protected]>
Forward-merge branch-24.08 into branch-24.10
closes rapidsai#15144 Authors: - Thomas Li (https://github.com/lithomas1) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Muhammad Haseeb (https://github.com/mhaseeb123) - Lawrence Mitchell (https://github.com/wence-) URL: rapidsai#16214
…rnel (rapidsai#16212) Improves the performance of `nvtext::hash_character_ngrams` using a warp-per-string kernel instead of a string per thread. Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: rapidsai#16212
PR to help prepare for the splitting out of pylibcudf. Authors: - Thomas Li (https://github.com/lithomas1) Approvers: - Bradley Dice (https://github.com/bdice) URL: rapidsai#16468
…` when type is known (rapidsai#16470) When we need to construct a column with a specific type, we do not need to go through the indirection of `build_column`, which matches a column subclass to a passed type; we can construct directly from the class instead. Authors: - Matthew Roeschke (https://github.com/mroeschke) Approvers: - Thomas Li (https://github.com/lithomas1) URL: rapidsai#16470
This PR fixes a small typo in the C++ code. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) URL: rapidsai#16473
Noticed that a few pylibcudf string docs were missing and added them here. Authors: - https://github.com/brandon-b-miller - Thomas Li (https://github.com/lithomas1) Approvers: - Thomas Li (https://github.com/lithomas1) URL: rapidsai#16471
…der (rapidsai#16281) In some situations in the Parquet reader (particularly with tables containing many columns or deeply nested columns) we burn a decent amount of time doing cudaMemset() operations on output buffers. A good amount of this overhead seems to stem from the fact that we're simply launching many tiny kernels. This PR adds a batched memset kernel that takes a list of device spans as a single input and does all the work under a single kernel launch. This PR addresses issue rapidsai#15773

## Improvements

On our performance cluster, the overall NDS queries showed an improvement of 2.39%. Additionally, benchmarks were added; they show large improvements (around 20%), especially on fixed-width data types, as shown below.

data_type | num_cols | cardinality | run_length | bytes_per_second_before_this_pr | bytes_per_second_after_this_pr | speedup
--- | --- | --- | --- | --- | --- | ---
INTEGRAL | 1000 | 0 | 1 | 36514934834 | 42756531566 | 1.170932709
INTEGRAL | 1000 | 1000 | 1 | 35364061247 | 39112512476 | 1.105996062
INTEGRAL | 1000 | 0 | 32 | 37349112510 | 39641370858 | 1.061373837
INTEGRAL | 1000 | 1000 | 32 | 39167079622 | 43740824957 | 1.116775245
FLOAT | 1000 | 0 | 1 | 51877322003 | 64083898838 | 1.235296973
FLOAT | 1000 | 1000 | 1 | 48983612272 | 58705522023 | 1.198472699
FLOAT | 1000 | 0 | 32 | 46544977658 | 53715018581 | 1.154045426
FLOAT | 1000 | 1000 | 32 | 54493432148 | 66617609904 | 1.22248879
DECIMAL | 1000 | 0 | 1 | 47616412888 | 57952310685 | 1.217065864
DECIMAL | 1000 | 1000 | 1 | 47166138095 | 54283772484 | 1.1509056
DECIMAL | 1000 | 0 | 32 | 45266163387 | 53770390830 | 1.18787162
DECIMAL | 1000 | 1000 | 32 | 52292176603 | 58847723569 | 1.125363819
TIMESTAMP | 1000 | 0 | 1 | 50245415328 | 60797982330 | 1.210020495
TIMESTAMP | 1000 | 1000 | 1 | 50300238706 | 60810368331 | 1.208947908
TIMESTAMP | 1000 | 0 | 32 | 55338354243 | 66786275739 | 1.206871376
TIMESTAMP | 1000 | 1000 | 32 | 55680028082 | 69029227374 | 1.23974843
DURATION | 1000 | 0 | 1 | 54680007758 | 66855201896 | 1.222662626
DURATION | 1000 | 1000 | 1 | 54305832171 | 66602436269 | 1.226432477
DURATION | 1000 | 0 | 32 | 60040760815 | 72663056969 | 1.210228784
DURATION | 1000 | 1000 | 32 | 60212221703 | 75646396131 | 1.256329595
STRING | 1000 | 0 | 1 | 29691707753 | 33388700976 | 1.12451265
STRING | 1000 | 1000 | 1 | 31411129876 | 35407241037 | 1.127219593
STRING | 1000 | 0 | 32 | 29680479388 | 33382478907 | 1.124728427
STRING | 1000 | 1000 | 32 | 35476213777 | 40478389269 | 1.141000827
LIST | 1000 | 0 | 1 | 6874253484 | 7370835717 | 1.072237987
LIST | 1000 | 1000 | 1 | 6763426009 | 7253762966 | 1.07249831
LIST | 1000 | 0 | 32 | 6981508808 | 7502741115 | 1.074658977
LIST | 1000 | 1000 | 32 | 6989374761 | 7506418252 | 1.073975643
STRUCT | 1000 | 0 | 1 | 2137525922 | 2189495762 | 1.024313081
STRUCT | 1000 | 1000 | 1 | 1057923939 | 1078475980 | 1.019426766
STRUCT | 1000 | 0 | 32 | 1637342446 | 1698913790 | 1.037604439
STRUCT | 1000 | 1000 | 32 | 1057587701 | 1082539399 | 1.02359303

Authors: - Rahul Prabhu (https://github.com/sdrp713) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - https://github.com/nvdbaranec - Muhammad Haseeb (https://github.com/mhaseeb123) - Kyle Edwards (https://github.com/KyleFromNVIDIA) - Bradley Dice (https://github.com/bdice) URL: rapidsai#16281
Closes rapidsai#16395 This PR resolves two types of compilation errors, allowing for successful builds with GCC 13: - replacing the `cuco_allocator` strong type with an alias to fix a new build time check with GCC 13 - removing `std::move` when returning a temporary Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - David Wendt (https://github.com/davidwendt) - Mark Harris (https://github.com/harrism) URL: rapidsai#16488
Fixes call to CUB `DeviceSegmentedSort::SortPairs` where the input and output indices pointed to the same temp memory. The documentation from https://nvidia.github.io/cccl/cub/api/structcub_1_1DeviceSegmentedSort.html#id8 indicates the `d_values_in` and `d_values_out` memory must not overlap so using the same pointer for both created invalid output in certain conditions. The internal function was implemented to expect the input values to be updated in-place. The fix uses separate device memory for the input and output indices. Closes rapidsai#16455 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: rapidsai#16463
…pidsai#16454) `cudf.Series` is a public constructor that happens to accept a private `ColumnBase` object. Many ops return columns, and it is natural to want to reconstruct a `Series` from them. This PR adds a `SingleColumnFrame._from_column` classmethod for instances where we need to wrap a new column in an `Index` or `Series`. This constructor also bypasses some unneeded validation in `ColumnAccessor` and `Series`. Authors: - Matthew Roeschke (https://github.com/mroeschke) - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - GALI PREM SAGAR (https://github.com/galipremsagar) URL: rapidsai#16454
Forward-merge branch-24.08 into branch-24.10
Add `stream` param to a bunch of stream compaction APIs. Authors: - Jayjeet Chakraborty (https://github.com/JayjeetAtGithub) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Nghia Truong (https://github.com/ttnghia) - Mark Harris (https://github.com/harrism) - Karthikeyan (https://github.com/karthikeyann) - Mike Wilson (https://github.com/hyperbolic2346) URL: rapidsai#16295
…rsion (rapidsai#16503) Contributes to rapidsai/build-planning#58. `scikit-build-core==0.10.0` was released today (https://github.com/scikit-build/scikit-build-core/releases/tag/v0.10.0), and wheel-building configurations across RAPIDS are incompatible with it. This proposes upgrading to that version and fixing configuration here in a way that: * is compatible with that new `scikit-build-core` version * takes advantage of the forward-compatibility mechanism (`minimum-version`) that `scikit-build-core` provides, to reduce the risk of needing to do this again in the future Authors: - James Lamb (https://github.com/jameslamb) Approvers: - https://github.com/jakirkham URL: rapidsai#16503
Add support for multiple new-line characters for BOL (`^` / `\A`) and EOL (`$` / `\Z`): - `\n` line-feed (already supported) - `\r` carriage-return - `\u0085` next line (NEL) - `\u2028` line separator - `\u2029` paragraph separator Reference rapidsai#15746 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Nghia Truong (https://github.com/ttnghia) - Navin Kumar (https://github.com/NVnavkumar) URL: rapidsai#15961
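To illustrate the matching behaviour described above, here is a rough emulation in Python. It is a sketch only and does not reflect the libcudf implementation: Python's `re` treats only `\n` as a line break in `MULTILINE` mode, so the helpers below (hypothetical names) use an explicit lookbehind and lookahead over the five listed characters.

```python
import re

# The five new-line characters listed above.
NEWLINES = "\n\r\u0085\u2028\u2029"

# Emulate BOL: start of input, or just after any listed new-line character.
BOL = r"(?:\A|(?<=[" + NEWLINES + r"]))"
# Emulate EOL: end of input, or just before any listed new-line character.
EOL = r"(?=[" + NEWLINES + r"]|\Z)"


def find_at_line_starts(word, text):
    """Start offsets where `word` begins a line under the extended rule."""
    pattern = re.compile(BOL + re.escape(word))
    return [m.start() for m in pattern.finditer(text)]


def find_at_line_ends(word, text):
    """Start offsets where `word` ends a line under the extended rule."""
    pattern = re.compile(re.escape(word) + EOL)
    return [m.start() for m in pattern.finditer(text)]


text = "abc\rabc\u2028abc\u0085xabc"
starts = find_at_line_starts("abc", text)
ends = find_at_line_ends("abc", text)
# For comparison, the built-in multi-line anchor only recognizes "\n".
builtin_starts = [m.start() for m in re.finditer(r"^abc", text, re.MULTILINE)]
```

On this input the built-in `^` finds only the match at the start of the string, while the emulated anchor also matches after `\r` and `\u2028`.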
…aded reader (rapidsai#16809) Closes rapidsai#16758 This PR adds an `io_type` axis to the benchmarks in `PARQUET_MULTITHREAD_READER_NVBENCH` with `PINNED_BUFFER` as default value. More description at rapidsai#16758. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) - Tianyu Liu (https://github.com/kingcrimsontianyu) URL: rapidsai#16809
This PR enables support for two features: - `python -m cudf.pandas` gives a REPL experience (previously it raised an error) - `python -m cudf.pandas -c "<commands>"` runs the provided commands (previously unsupported) Authors: - Bradley Dice (https://github.com/bdice) - Matthew Murray (https://github.com/Matt711) Approvers: - Matthew Murray (https://github.com/Matt711) URL: rapidsai#16428
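The two entry-point behaviours described above can be sketched with the standard library as follows. This is a hypothetical illustration of a `-c` / REPL dispatcher, not the actual `cudf.pandas` code; the `main` function, its `interact` parameter, and the `mymodule` program name are assumptions made for the example.

```python
import argparse
import code


def main(argv=None, interact=code.interact):
    """Run `-c` commands if given, otherwise start an interactive session.

    Hypothetical sketch: `interact` is injectable so the REPL branch can be
    exercised without blocking on stdin.
    """
    parser = argparse.ArgumentParser(prog="python -m mymodule")
    parser.add_argument("-c", dest="command", help="program passed in as a string")
    args = parser.parse_args(argv)

    # Fresh namespace so the executed code behaves like a top-level script.
    namespace = {"__name__": "__main__"}
    if args.command is not None:
        exec(args.command, namespace)
    else:
        interact(local=namespace)
    return namespace


# Usage: run a one-line program and inspect the resulting namespace.
result = main(["-c", "x = 1 + 1"])
```

Calling `main([])` instead would drop into `code.interact`, which gives the REPL experience.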
…rapidsai#16751) Related to rapidsai#16750 This PR adds a benchmark to study read throughput of Parquet reader for wide tables. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Vukasin Milovanovic (https://github.com/vuule) URL: rapidsai#16751
There are two implementations of the same action; one in [rapidsai/shared-actions](https://github.com/rapidsai/shared-actions/tree/main/get-pr-info) and [the other](https://github.com/nv-gha-runners/get-pr-info) in the nv-gha-runners org. This PR switches to the implementation in the nv-gha-runners group in order to keep a single source of truth. Tested in https://github.com/rapidsai/cudf/actions/runs/10906617425/job/30268277178?pr=16819#step:4:5
… to inf (rapidsai#16750) Closes rapidsai#16733. This PR changes the default value of Parquet writer's default max row group size from 128MB to 1Million rows. This allows avoiding thin row group strips when writing wide (> 512 cols) tables resulting in a significantly improved read throughput for wide tables (especially when low cardinality) with virtually no impact to narrow-tables read performance. Benchmarked using: rapidsai#16751 ## Results ### Hardware ``` GPU: NVIDIA RTX 5880 Ada Generation SM Version: 890 (PTX Version: 860) Number of SMs: 110 SM Default Clock Rate: 18446744071874 MHz Global Memory: 23879 MiB Free / 48632 MiB Total Global Memory Bus Peak: 960 GB/sec (384-bit DDR @10001MHz) Max Shared Memory: 100 KiB/SM, 48 KiB/Block L2 Cache Size: 98304 KiB Maximum Active Blocks: 24/SM Maximum Active Threads: 1536/SM, 1024/Block Available Registers: 65536/SM, 65536/Block ECC Enabled: No ``` ### Read Throughput ``` ## parquet_read_wide_tables_mixed | T | num_rows | num_cols | GPU Time_old | GPU Time_new | bytes_per_second_old | bytes_per_second_new | peak_memory_usage_old | peak_memory_usage_new | encoded_file_size_old | encoded_file_size_new | |-----------|----------|----------|----------------|----------------|----------------------|----------------------|-----------------------|-----------------------|-----------------------|-----------------------| | INTEGRAL | 10000 | 64 | 940.690 us | 928.387 us | 570720378014 | 578283256754 | 3.405 MiB | 3.405 MiB | 748.248 KiB | 748.248 KiB | | INTEGRAL | 100000 | 64 | 2.053 ms | 2.037 ms | 261541794543 | 263500220325 | 28.308 MiB | 28.308 MiB | 5.164 MiB | 5.164 MiB | | INTEGRAL | 500000 | 64 | 5.783 ms | 5.693 ms | 92838553328 | 94296134644 | 139.928 MiB | 139.042 MiB | 24.698 MiB | 24.325 MiB | | INTEGRAL | 1000000 | 64 | 11.400 ms | 10.775 ms | 47092763803 | 49824643807 | 279.254 MiB | 277.470 MiB | 49.042 MiB | 48.284 MiB | | INTEGRAL | 10000 | 256 | 1.718 ms | 1.732 ms | 312407306091 | 309935794547 | 13.752 MiB | 
13.752 MiB | 2.956 MiB | 2.956 MiB | | INTEGRAL | 100000 | 256 | 5.726 ms | 5.818 ms | 93765292338 | 92275580643 | 114.366 MiB | 114.366 MiB | 20.743 MiB | 20.743 MiB | | INTEGRAL | 500000 | 256 | 25.179 ms | 22.159 ms | 21322289603 | 24228371776 | 572.905 MiB | 561.786 MiB | 103.796 MiB | 97.677 MiB | | INTEGRAL | 1000000 | 256 | 48.259 ms | 42.428 ms | 11124725758 | 12653746472 | 1.117 GiB | 1.095 GiB | 206.155 MiB | 193.886 MiB | | INTEGRAL | 10000 | 512 | 2.741 ms | 2.758 ms | 195853280055 | 194632437549 | 27.508 MiB | 27.508 MiB | 5.918 MiB | 5.918 MiB | | INTEGRAL | 100000 | 512 | 11.197 ms | 10.600 ms | 47945685016 | 50646524148 | 235.910 MiB | 228.755 MiB | 44.559 MiB | 41.510 MiB | | INTEGRAL | 500000 | 512 | 54.929 ms | 43.554 ms | 9773962645 | 12326557981 | 1.146 GiB | 1.097 GiB | 221.266 MiB | 195.384 MiB | | INTEGRAL | 1000000 | 512 | 103.779 ms | 82.403 ms | 5173195193 | 6515218035 | 2.288 GiB | 2.190 GiB | 442.101 MiB | 387.861 MiB | | INTEGRAL | 10000 | 1024 | 5.210 ms | 5.405 ms | 103040438112 | 99319591295 | 54.937 MiB | 54.937 MiB | 11.829 MiB | 11.829 MiB | | INTEGRAL | 100000 | 1024 | 26.891 ms | 20.194 ms | 19964357393 | 26585391032 | 498.410 MiB | 456.756 MiB | 99.962 MiB | 82.939 MiB | | INTEGRAL | 500000 | 1024 | 135.404 ms | 84.676 ms | 3964957208 | 6340314329 | 2.434 GiB | 2.191 GiB | 500.554 MiB | 390.418 MiB | | INTEGRAL | 1000000 | 1024 | 256.033 ms | 162.217 ms | 2096879057 | 3309593393 | 4.869 GiB | 4.372 GiB | 1001.573 MiB | 775.040 MiB | | FLOAT | 10000 | 64 | 962.219 us | 951.565 us | 557950915640 | 564197923891 | 5.275 MiB | 5.275 MiB | 1012.101 KiB | 1012.101 KiB | | FLOAT | 100000 | 64 | 2.032 ms | 2.032 ms | 264218700681 | 264250413360 | 45.321 MiB | 45.321 MiB | 6.316 MiB | 6.316 MiB | | FLOAT | 500000 | 64 | 6.660 ms | 6.693 ms | 80611279094 | 80219014175 | 224.129 MiB | 222.946 MiB | 29.685 MiB | 29.044 MiB | | FLOAT | 1000000 | 64 | 13.560 ms | 13.758 ms | 39591771965 | 39023315442 | 447.103 MiB | 445.007 MiB | 58.762 MiB 
| 57.482 MiB | | FLOAT | 10000 | 256 | 1.808 ms | 1.825 ms | 297020886609 | 294226222306 | 21.109 MiB | 21.109 MiB | 3.968 MiB | 3.968 MiB | | FLOAT | 100000 | 256 | 6.921 ms | 6.307 ms | 77571490752 | 85116522574 | 185.578 MiB | 181.271 MiB | 27.393 MiB | 25.256 MiB | | FLOAT | 500000 | 256 | 30.064 ms | 25.955 ms | 17857874786 | 20684696586 | 914.366 MiB | 891.787 MiB | 128.981 MiB | 116.186 MiB | | FLOAT | 1000000 | 256 | 59.189 ms | 48.592 ms | 9070460126 | 11048464794 | 1.787 GiB | 1.738 GiB | 258.075 MiB | 229.920 MiB | | FLOAT | 10000 | 512 | 2.998 ms | 3.006 ms | 179078195058 | 178594968077 | 42.222 MiB | 42.222 MiB | 7.941 MiB | 7.941 MiB | | FLOAT | 100000 | 512 | 14.160 ms | 12.314 ms | 37915291403 | 43597041127 | 376.553 MiB | 362.567 MiB | 60.136 MiB | 50.537 MiB | | FLOAT | 500000 | 512 | 69.524 ms | 50.251 ms | 7722076774 | 10683715204 | 1.826 GiB | 1.742 GiB | 292.552 MiB | 232.393 MiB | | FLOAT | 1000000 | 512 | 130.729 ms | 95.458 ms | 4106742786 | 5624164002 | 3.647 GiB | 3.477 GiB | 581.180 MiB | 459.927 MiB | | FLOAT | 10000 | 1024 | 6.351 ms | 6.492 ms | 84532884515 | 82693769317 | 84.452 MiB | 84.452 MiB | 15.893 MiB | 15.893 MiB | | FLOAT | 100000 | 1024 | 36.898 ms | 26.302 ms | 14550146722 | 20411596018 | 778.441 MiB | 725.125 MiB | 136.809 MiB | 101.066 MiB | | FLOAT | 500000 | 1024 | 166.699 ms | 98.340 ms | 3220600409 | 5459311820 | 3.802 GiB | 3.484 GiB | 685.702 MiB | 464.775 MiB | | FLOAT | 1000000 | 1024 | 339.687 ms | 188.463 ms | 1580487011 | 2848673918 | 7.606 GiB | 6.953 GiB | 1.340 GiB | 919.840 MiB | | DECIMAL | 10000 | 64 | 1.076 ms | 1.092 ms | 498752693210 | 491676757508 | 7.485 MiB | 7.485 MiB | 1.216 MiB | 1.216 MiB | | DECIMAL | 100000 | 64 | 2.166 ms | 2.172 ms | 247840684988 | 247198078197 | 65.498 MiB | 65.498 MiB | 6.658 MiB | 6.658 MiB | | DECIMAL | 500000 | 64 | 7.421 ms | 7.058 ms | 72343289850 | 76066836305 | 325.515 MiB | 322.466 MiB | 31.349 MiB | 29.384 MiB | | DECIMAL | 1000000 | 64 | 15.239 ms | 14.020 ms | 
35230516583 | 38291860266 | 649.547 MiB | 643.714 MiB | 61.759 MiB | 57.826 MiB | | DECIMAL | 10000 | 256 | 1.989 ms | 1.989 ms | 269930562597 | 269886680781 | 30.119 MiB | 30.119 MiB | 4.896 MiB | 4.896 MiB | | DECIMAL | 100000 | 256 | 7.839 ms | 6.966 ms | 68483613468 | 77073587059 | 269.638 MiB | 263.547 MiB | 30.588 MiB | 26.664 MiB | | DECIMAL | 500000 | 256 | 35.199 ms | 26.893 ms | 15252335676 | 19963411264 | 1.312 GiB | 1.267 GiB | 150.948 MiB | 117.601 MiB | | DECIMAL | 1000000 | 256 | 72.584 ms | 50.944 ms | 7396511691 | 10538553316 | 2.622 GiB | 2.529 GiB | 301.231 MiB | 231.353 MiB | | DECIMAL | 10000 | 512 | 3.612 ms | 3.595 ms | 148642296188 | 149335059500 | 60.283 MiB | 60.283 MiB | 9.801 MiB | 9.801 MiB | | DECIMAL | 100000 | 512 | 19.820 ms | 14.084 ms | 27087819156 | 38119174003 | 562.417 MiB | 527.494 MiB | 75.263 MiB | 53.349 MiB | | DECIMAL | 500000 | 512 | 94.913 ms | 51.910 ms | 5656452419 | 10342308581 | 2.747 GiB | 2.536 GiB | 377.112 MiB | 235.187 MiB | | DECIMAL | 1000000 | 512 | 180.513 ms | 98.562 ms | 2974131976 | 5447057883 | 5.494 GiB | 5.063 GiB | 754.738 MiB | 462.785 MiB | | DECIMAL | 10000 | 1024 | 7.667 ms | 6.777 ms | 70025338013 | 79218913933 | 120.656 MiB | 120.656 MiB | 19.616 MiB | 19.616 MiB | | DECIMAL | 100000 | 1024 | 61.182 ms | 26.946 ms | 8775038947 | 19923803470 | 1.184 GiB | 1.031 GiB | 201.928 MiB | 106.705 MiB | | DECIMAL | 500000 | 1024 | 261.218 ms | 102.314 ms | 2055261558 | 5247292283 | 5.921 GiB | 5.076 GiB | 1012.826 MiB | 470.402 MiB | | DECIMAL | 1000000 | 1024 | 513.386 ms | 196.347 ms | 1045744543 | 2734301880 | 11.843 GiB | 10.133 GiB | 1.980 GiB | 925.576 MiB | | TIMESTAMP | 10000 | 64 | 1.014 ms | 1.016 ms | 529606978079 | 528414399822 | 6.079 MiB | 6.079 MiB | 1.068 MiB | 1.068 MiB | | TIMESTAMP | 100000 | 64 | 2.057 ms | 2.053 ms | 261019684779 | 261455248599 | 52.688 MiB | 52.688 MiB | 6.436 MiB | 6.436 MiB | | TIMESTAMP | 500000 | 64 | 6.950 ms | 6.761 ms | 77245644716 | 79410211533 | 260.606 MiB 
| 259.304 MiB | 29.924 MiB | 29.164 MiB | | TIMESTAMP | 1000000 | 64 | 14.506 ms | 13.832 ms | 37010291008 | 38813599633 | 521.240 MiB | 517.604 MiB | 59.878 MiB | 57.601 MiB | | TIMESTAMP | 10000 | 256 | 1.878 ms | 1.889 ms | 285887176743 | 284275145551 | 24.328 MiB | 24.328 MiB | 4.290 MiB | 4.290 MiB | | TIMESTAMP | 100000 | 256 | 7.198 ms | 6.458 ms | 74586920018 | 83128450019 | 215.854 MiB | 210.739 MiB | 28.681 MiB | 25.734 MiB | | TIMESTAMP | 500000 | 256 | 34.185 ms | 26.654 ms | 15705060785 | 20142331826 | 1.044 GiB | 1.013 GiB | 137.016 MiB | 116.663 MiB | | TIMESTAMP | 1000000 | 256 | 66.420 ms | 49.599 ms | 8083007343 | 10824295857 | 2.085 GiB | 2.022 GiB | 272.580 MiB | 230.395 MiB | | TIMESTAMP | 10000 | 512 | 3.143 ms | 3.150 ms | 170821086658 | 170446277893 | 48.702 MiB | 48.702 MiB | 8.591 MiB | 8.591 MiB | | TIMESTAMP | 100000 | 512 | 17.652 ms | 12.615 ms | 30413872283 | 42557024194 | 440.115 MiB | 421.891 MiB | 63.197 MiB | 51.502 MiB | | TIMESTAMP | 500000 | 512 | 75.454 ms | 50.955 ms | 7115233856 | 10536117334 | 2.146 GiB | 2.028 GiB | 315.073 MiB | 233.355 MiB | | TIMESTAMP | 1000000 | 512 | 140.692 ms | 95.964 ms | 3815935506 | 5594485106 | 4.285 GiB | 4.048 GiB | 627.348 MiB | 460.885 MiB | | TIMESTAMP | 10000 | 1024 | 6.436 ms | 6.975 ms | 83411903593 | 76971777095 | 97.454 MiB | 97.454 MiB | 17.196 MiB | 17.196 MiB | | TIMESTAMP | 100000 | 1024 | 45.659 ms | 26.728 ms | 11758159876 | 20086145129 | 936.005 MiB | 844.159 MiB | 159.908 MiB | 103.000 MiB | | TIMESTAMP | 500000 | 1024 | 199.636 ms | 99.231 ms | 2689242353 | 5410303529 | 4.557 GiB | 4.057 GiB | 794.728 MiB | 466.703 MiB | | TIMESTAMP | 1000000 | 1024 | 372.691 ms | 192.598 ms | 1440523696 | 2787517681 | 9.104 GiB | 8.099 GiB | 1.551 GiB | 921.760 MiB | | DURATION | 10000 | 64 | 986.208 us | 989.153 us | 544379023579 | 542758221495 | 6.417 MiB | 6.417 MiB | 932.501 KiB | 932.501 KiB | | DURATION | 100000 | 64 | 2.222 ms | 2.018 ms | 241594183626 | 266034888500 | 57.291 MiB | 
57.291 MiB | 6.079 MiB | 6.079 MiB | | DURATION | 500000 | 64 | 6.642 ms | 6.673 ms | 80830328889 | 80453377113 | 284.029 MiB | 283.224 MiB | 28.819 MiB | 28.288 MiB | | DURATION | 1000000 | 64 | 13.150 ms | 13.488 ms | 40828039129 | 39804805295 | 567.280 MiB | 565.669 MiB | 57.137 MiB | 56.075 MiB | | DURATION | 10000 | 256 | 1.805 ms | 1.815 ms | 297459887040 | 295856879191 | 25.686 MiB | 25.686 MiB | 3.665 MiB | 3.665 MiB | | DURATION | 100000 | 256 | 6.839 ms | 6.270 ms | 78502421937 | 85630914910 | 232.874 MiB | 229.165 MiB | 25.863 MiB | 24.323 MiB | | DURATION | 500000 | 256 | 29.886 ms | 26.234 ms | 17964080662 | 20464503730 | 1.125 GiB | 1.106 GiB | 123.885 MiB | 113.179 MiB | | DURATION | 1000000 | 256 | 58.290 ms | 48.418 ms | 9210348188 | 11088351436 | 2.250 GiB | 2.210 GiB | 247.272 MiB | 224.312 MiB | | DURATION | 10000 | 512 | 3.035 ms | 2.964 ms | 176885037888 | 181108374773 | 51.383 MiB | 51.383 MiB | 7.342 MiB | 7.342 MiB | | DURATION | 100000 | 512 | 14.492 ms | 12.136 ms | 37044853523 | 44237579412 | 474.355 MiB | 458.371 MiB | 55.996 MiB | 48.689 MiB | | DURATION | 500000 | 512 | 70.131 ms | 51.095 ms | 7655286246 | 10507294503 | 2.299 GiB | 2.213 GiB | 271.064 MiB | 226.438 MiB | | DURATION | 1000000 | 512 | 132.495 ms | 95.019 ms | 4051999205 | 5650150759 | 4.593 GiB | 4.419 GiB | 541.495 MiB | 448.815 MiB | | DURATION | 10000 | 1024 | 6.576 ms | 6.318 ms | 81638807422 | 84977253627 | 102.782 MiB | 102.782 MiB | 14.701 MiB | 14.701 MiB | | DURATION | 100000 | 1024 | 38.001 ms | 26.011 ms | 14127627316 | 20640219375 | 964.471 MiB | 916.755 MiB | 127.532 MiB | 97.394 MiB | | DURATION | 500000 | 1024 | 159.928 ms | 98.126 ms | 3356945213 | 5471258270 | 4.711 GiB | 4.426 GiB | 639.050 MiB | 452.925 MiB | | DURATION | 1000000 | 1024 | 305.818 ms | 188.647 ms | 1755524869 | 2845895428 | 9.422 GiB | 8.839 GiB | 1.249 GiB | 897.737 MiB | | STRING | 10000 | 64 | 2.241 ms | 2.244 ms | 239611491431 | 239240518530 | 15.926 MiB | 15.926 MiB | 2.075 MiB | 
2.075 MiB | | STRING | 100000 | 64 | 4.862 ms | 4.822 ms | 110419679907 | 111346705245 | 132.646 MiB | 132.646 MiB | 8.087 MiB | 8.087 MiB | | STRING | 500000 | 64 | 20.498 ms | 17.812 ms | 26191957819 | 30140554720 | 664.294 MiB | 645.028 MiB | 40.456 MiB | 30.817 MiB | | STRING | 1000000 | 64 | 37.773 ms | 34.985 ms | 14213079575 | 15345709268 | 1.298 GiB | 1.255 GiB | 80.941 MiB | 59.259 MiB | | STRING | 10000 | 256 | 4.125 ms | 4.171 ms | 130163506067 | 128706550148 | 63.789 MiB | 63.789 MiB | 8.319 MiB | 8.319 MiB | | STRING | 100000 | 256 | 22.074 ms | 17.799 ms | 24321103825 | 30162947098 | 584.754 MiB | 530.912 MiB | 58.602 MiB | 32.330 MiB | | STRING | 500000 | 256 | 93.278 ms | 66.770 ms | 5755572906 | 8040584271 | 2.857 GiB | 2.521 GiB | 294.130 MiB | 123.271 MiB | | STRING | 1000000 | 256 | 190.999 ms | 122.359 ms | 2810851154 | 4387682165 | 5.715 GiB | 5.023 GiB | 588.586 MiB | 237.018 MiB | | STRING | 10000 | 512 | 7.520 ms | 8.010 ms | 71390390607 | 67021971176 | 127.538 MiB | 127.538 MiB | 16.634 MiB | 16.634 MiB | | STRING | 100000 | 512 | 51.666 ms | 32.251 ms | 10391219810 | 16646741143 | 1.259 GiB | 1.037 GiB | 173.940 MiB | 64.682 MiB | | STRING | 500000 | 512 | 251.723 ms | 125.963 ms | 2132782858 | 4262141577 | 6.300 GiB | 5.040 GiB | 873.437 MiB | 246.559 MiB | | STRING | 1000000 | 512 | 477.668 ms | 244.912 ms | 1123940871 | 2192101011 | 12.602 GiB | 10.044 GiB | 1.707 GiB | 474.121 MiB | | STRING | 10000 | 1024 | 17.184 ms | 16.128 ms | 31242201518 | 33288874029 | 276.395 MiB | 254.971 MiB | 40.126 MiB | 33.243 MiB | | STRING | 100000 | 1024 | 132.094 ms | 63.304 ms | 4064323158 | 8480799642 | 2.721 GiB | 2.073 GiB | 414.092 MiB | 129.316 MiB | | STRING | 500000 | 1024 | 608.283 ms | 251.026 ms | 882600977 | 2138709222 | 13.618 GiB | 10.076 GiB | 2.028 GiB | 493.067 MiB | | STRING | 1000000 | 1024 | 1.249 s | 485.734 ms | 429750505 | 1105276473 | 27.239 GiB | 20.079 GiB | 4.059 GiB | 948.185 MiB | ``` Authors: - Muhammad Haseeb 
(https://github.com/mhaseeb123) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) - Bradley Dice (https://github.com/bdice) - Charles Blackmon-Luca (https://github.com/charlesbluca) URL: rapidsai#16750
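A back-of-the-envelope calculation shows why a byte-based limit produces thin row groups for wide tables. The sketch below is an illustration under simplifying assumptions that are not from the PR: every column is fixed-width at 8 bytes per value, sizes are uncompressed, and 128MB is taken as 128 MiB. The function names are hypothetical.

```python
import math

MIB = 1024 * 1024


def rows_per_group_under_byte_limit(num_cols, bytes_per_value, limit_bytes):
    """Rows that fit in one row group when only a byte limit applies."""
    return max(1, limit_bytes // (num_cols * bytes_per_value))


def num_row_groups(num_rows, rows_per_group):
    """Number of row groups needed to hold `num_rows` rows."""
    return math.ceil(num_rows / rows_per_group)


num_rows = 1_000_000

# Wide table: 1024 columns of 8-byte values under a 128 MiB byte limit.
wide_rows = rows_per_group_under_byte_limit(1024, 8, 128 * MIB)
wide_groups = num_row_groups(num_rows, wide_rows)

# Narrow table: 64 columns of 8-byte values under the same byte limit.
narrow_rows = rows_per_group_under_byte_limit(64, 8, 128 * MIB)
narrow_groups = num_row_groups(num_rows, narrow_rows)

# With a 1-million-row limit and no byte limit, both tables fit in one group.
row_limited_groups = num_row_groups(num_rows, 1_000_000)
```

Under these assumptions the 1024-column table is cut into row groups of 16384 rows each (62 groups for one million rows), whereas a row-count limit keeps it in a single group.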
This PR refactors `mixed_semi_join` by replacing **cuco** legacy `static_map` with latest `static_set`. Contributes to rapidsai#12261. Authors: - Srinivas Yadav (https://github.com/srinivasyadav18) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Nghia Truong (https://github.com/ttnghia) URL: rapidsai#16230
All RAPIDS libraries have been updated with Python 3.12 support, so Python 3.12 changes have been merged into `branch-24.10` of `shared-workflows`: rapidsai/shared-workflows#213 This updates GitHub Actions configs here to that branch.
No description provided.