forked from rapidsai/cudf
Commit
Remove support for skip_rows / num_rows options in the parquet reader. (rapidsai#11503)

Removes support for skip_rows / num_rows options in the parquet reader. Users retain control of what gets read via row groups.

Did some before/after benchmarking. As expected, this doesn't change much except for a minor boost in list reading (due to simplification of the preprocessing step). Most of the ways the row bounds affected the code were in the page setup process (making it slippery to think through the logic), and they didn't do much in the actual process of decoding.

A selection of before/after benchmarks (all input files ~512 MB):

```
ParquetRead/integral_buffer_input/29/1000/32/0/1/manual_time
Before: bytes_per_second=31.4564G/s
After:  bytes_per_second=31.58G/s

ParquetRead/floats_buffer_input/31/1000/32/0/1/manual_time
Before: bytes_per_second=49.2819G/s
After:  bytes_per_second=49.7408G/s

ParquetRead/string_file_input/23/1000/32/0/0/manual_time
Before: bytes_per_second=24.634G/s
After:  bytes_per_second=24.6563G/s

ParquetRead/string_buffer_input/23/0/1/0/1/manual_time
Before: bytes_per_second=5.03313G/s
After:  bytes_per_second=5.03535G/s

ParquetRead/list_buffer_input/24/0/1/1/1/manual_time
Before: bytes_per_second=1.11488G/s
After:  bytes_per_second=1.31447G/s
```

Authors:
- https://github.com/nvdbaranec

Approvers:
- Mike Wilson (https://github.com/hyperbolic2346)
- Yunsong Wang (https://github.com/PointKernel)

URL: rapidsai#11503
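Since the commit message says callers keep control of what gets read via row groups, a caller who previously relied on skip_rows / num_rows might translate a row range into row group indices first. The following is a minimal sketch of that translation, not code from this commit: `row_groups_for_range` and its `row_group_sizes` argument are hypothetical names, and the per-row-group row counts are assumed to have been obtained from the file's metadata by some other means.

```python
from bisect import bisect_right
from itertools import accumulate


def row_groups_for_range(row_group_sizes, skip_rows, num_rows):
    """Map a row range onto row group indices (hypothetical helper).

    row_group_sizes: rows per row group, in file order (assumed to come
    from the Parquet file's metadata).
    Returns (indices, local_offset): the row groups that cover rows
    [skip_rows, skip_rows + num_rows), and how many leading rows of the
    data read from those groups must be dropped to start at skip_rows.
    """
    if skip_rows < 0 or num_rows < 0:
        raise ValueError("skip_rows and num_rows must be non-negative")

    total = sum(row_group_sizes)
    start = min(skip_rows, total)
    end = min(skip_rows + num_rows, total)
    if start >= end:
        return [], 0

    # ends[i] is the exclusive end row of row group i.
    ends = list(accumulate(row_group_sizes))
    first = bisect_right(ends, start)    # first group whose end is past start
    last = bisect_right(ends, end - 1)   # group holding the last wanted row
    group_start = ends[first] - row_group_sizes[first]
    return list(range(first, last + 1)), start - group_start


# Three row groups of 100, 100, and 50 rows; request rows 150..209.
indices, offset = row_groups_for_range([100, 100, 50], skip_rows=150, num_rows=60)
```

The returned indices would then be handed to the reader's row group selection, and the caller would slice the result by `offset` and `num_rows` to recover the exact range. The trade-off is that whole row groups are decoded even when only part of one is needed.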
1 parent 87a5e6a, commit d39b957
Showing 7 changed files with 110 additions and 646 deletions.