[FEA] Implement all libcudf modules required by cuDF Python in pylibcudf #15162

Open
vyasr opened this issue Feb 27, 2024 · 0 comments
Labels
feature request New feature or request pylibcudf Issues specific to the pylibcudf package

Comments

@vyasr
Contributor

vyasr commented Feb 27, 2024

Is your feature request related to a problem? Please describe.
pylibcudf is intended to provide a low-level Python interface to the libcudf C++ API. cuDF's internals will ultimately be refactored to depend on pylibcudf. As a first step, we need to expose all libcudf algorithms used by cuDF Cython in pylibcudf.

Describe the solution you'd like
This is a tracking issue for the APIs to expose in Cython. The APIs are grouped by the pxd file that exposes the corresponding libcudf APIs in Cython, which roughly corresponds to the namespaces in libcudf.

| Module | PRs (or assignees) | Notes |
| --- | --- | --- |
| aggregation.pxd | #14945, #14970 |  |
| binaryop.pxd | #14821 |  |
| column/column.pxd | #13562 | pylibcudf Columns share ownership |
| column/column_factories.pxd | #15257 |  |
| column/column_view.pxd | #13562 | pylibcudf Columns share ownership |
| concatenate.pxd | #15011 |  |
| contiguous_split.pxd | #16953 |  |
| copying.pxd | #13562, #14508, #14640 |  |
| datetime.pxd | #15916 | Note: started in #15916, but not finished |
| expressions.pxd | #16056 |  |
| filling.pxd | #15225 |  |
| groupby.pxd | #14945 |  |
| hash.pxd | #15418 |  |
| interop.pxd |  |  |
| io/arrow_io_source.pxd | #16050 |  |
| io/avro.pxd | #15899 |  |
| io/csv.pxd | #16011 | reader only, writer needs porting |
| io/data_sink.pxd |  |  |
| io/datasource.pxd | #16050 |  |
| io/json.pxd | #15952, #15966 |  |
| io/orc.pxd | #16042 |  |
| io/orc_metadata.pxd |  |  |
| io/parquet.pxd | #16078 | reader only, read parquet metadata/writer needs porting |
| io/text.pxd |  |  |
| io/timezone.pxd | #16771 |  |
| io/types.pxd |  |  |
| join.pxd | #14972 |  |
| labeling.pxd | #16761 |  |
| lists/combine.pxd | #15928 |  |
| lists/contains.pxd | #15981 |  |
| lists/count_elements.pxd | #16072 |  |
| lists/explode.pxd | #15011 |  |
| lists/extract.pxd | #16071 |  |
| lists/gather.pxd | #16170 |  |
| lists/lists_column_view.pxd | #16175 |  |
| lists/sorting.pxd | #16179 |  |
| lists/stream_compaction.pxd | #16184 |  |
| lists/reverse.pxd | #16185 |  |
| merge.pxd | #15011 |  |
| null_mask.pxd | #15908 |  |
| nvtext/byte_pair_encode.pxd |  |  |
| nvtext/edit_distance.pxd |  |  |
| nvtext/generate_ngrams.pxd |  |  |
| nvtext/jaccard.pxd |  |  |
| nvtext/minhash.pxd |  |  |
| nvtext/ngrams_tokenize.pxd |  |  |
| nvtext/normalize.pxd |  |  |
| nvtext/replace.pxd |  |  |
| nvtext/stemmer.pxd |  |  |
| nvtext/subword_tokenize.pxd |  |  |
| nvtext/tokenize.pxd |  |  |
| partitioning.pxd | #16781 |  |
| quantiles.pxd | #15874 |  |
| reduce.pxd | #14970 |  |
| replace.pxd | #15005 |  |
| reshape.pxd | #15827 | sans byte_cast which is only used by cpp/java |
| rolling.pxd | #14982 |  |
| round.pxd | #15863 |  |
| scalar/scalar.pxd | #14133 |  |
| search.pxd | #15166 |  |
| sorting.pxd | #15011 |  |
| stream_compaction.pxd | #15011 |  |
| strings/convert/convert_booleans.pxd |  |  |
| strings/convert/convert_datetime.pxd |  |  |
| strings/convert/convert_durations.pxd |  |  |
| strings/convert/convert_fixed_point.pxd |  |  |
| strings/convert/convert_floats.pxd |  |  |
| strings/convert/convert_integers.pxd |  |  |
| strings/convert/convert_ipv4.pxd |  |  |
| strings/convert/convert_lists.pxd |  |  |
| strings/convert/convert_urls.pxd |  |  |
| strings/split/partition.pxd | #16940 |  |
| strings/split/split.pxd | #16940 |  |
| strings/attributes.pxd | #16785 |  |
| strings/capitalize.pxd | #15503 |  |
| strings/case.pxd | #15489 |  |
| strings/char_types.pxd | #16788 |  |
| strings/combine.pxd | #16790 |  |
| strings/contains.pxd | #16814 |  |
| strings/extract.pxd | #16823 |  |
| strings/find.pxd | #15604 |  |
| strings/find_multiple.pxd | #16920 |  |
| strings/findall.pxd | #16825 |  |
| strings/json.pxd |  |  |
| strings/padding.pxd | #16833 |  |
| strings/regex_flags.pxd | #15880 |  |
| strings/regex_program.pxd | #15880 |  |
| strings/repeat.pxd | #16834 |  |
| strings/replace.pxd | #15839 |  |
| strings/replace_re.pxd |  |  |
| strings/side_type.pxd | #16833 |  |
| strings/strip.pxd | #16833 |  |
| strings/substring.pxd | #15988 |  |
| strings/translate.pxd | #16934 |  |
| strings/wrap.pxd | #16935 |  |
| strings_udf.pxd |  |  |
| table/table.pxd | #13562 | Tables share column ownership in pylibcudf |
| table/table_view.pxd | #13562 | Tables share column ownership in pylibcudf |
| transform.pxd | #16760 |  |
| transpose.pxd | #16749 |  |
| types.pxd | #13562 | More types added as needed in other PRs |
| unary.pxd | #14850 |  |
| utilities/host_span.pxd |  |  |
| wrappers/decimals.pxd |  |  |
| wrappers/durations.pxd |  |  |
| wrappers/timestamps.pxd |  |  |
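
For anyone who wants to tally what is still unclaimed, the table above can be parsed mechanically. The following is a minimal sketch in plain Python, not part of cuDF or pylibcudf: the helper names (`parse_row`, `parse_table`, `modules_without_prs`) and the `SAMPLE` excerpt are hypothetical and exist only for illustration. It assumes each row starts with a module path ending in `.pxd`, followed by zero or more `#NNNN` PR references, followed by free-form notes.

```python
import re

# A PR reference such as "#15952" or "#15952," (trailing comma allowed).
PR_TOKEN = re.compile(r"#(\d+),?")

# Hypothetical excerpt: a handful of rows copied from the table above.
SAMPLE = """\
Module PRs (or assignees) Notes
datetime.pxd #15916 Note: started in #15916, but not finished
interop.pxd
io/csv.pxd #16011 reader only, writer needs porting
io/json.pxd #15952 #15966
strings/json.pxd
"""


def parse_row(line):
    """Return (module, [PR numbers]) for a table row, or None for other lines."""
    # Treat table pipes as whitespace so both plain and pipe-delimited rows work.
    tokens = line.replace("|", " ").split()
    if not tokens or not tokens[0].endswith(".pxd"):
        return None
    prs = []
    for token in tokens[1:]:
        match = PR_TOKEN.fullmatch(token)
        if match is None:
            break  # first non-PR token starts the notes column
        prs.append(int(match.group(1)))
    return tokens[0], prs


def parse_table(text):
    """Map each module path in `text` to the PR numbers listed for it."""
    rows = (parse_row(line) for line in text.splitlines())
    return dict(row for row in rows if row is not None)


def modules_without_prs(table):
    """Return the sorted module paths that have no PR listed."""
    return sorted(module for module, prs in table.items() if not prs)


print(modules_without_prs(parse_table(SAMPLE)))
```

Note that this only reports whether a PR is listed, not whether the module is finished: a row such as `datetime.pxd`, whose note says the work was started but not completed, still counts as having a PR.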
@vyasr vyasr added the feature request New feature or request label Feb 27, 2024
rapids-bot bot pushed a commit that referenced this issue Mar 2, 2024
Contributes to #15162

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #15166
rapids-bot bot pushed a commit that referenced this issue Mar 18, 2024
This PR also introduces `std::out_of_range` to cudf's code base in cases where it is appropriate.

Contributes to #12885 
Resolves #15315 
Contributes to #15162

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #15319
rapids-bot bot pushed a commit that referenced this issue Apr 11, 2024
This PR creates `pylibcudf` `case` APIs and migrates the cuDF Cython to leverage them. Part of #15162.

Authors:
  - https://github.com/brandon-b-miller
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15489
rapids-bot bot pushed a commit that referenced this issue May 24, 2024
@vyasr vyasr added the pylibcudf Issues specific to the pylibcudf package label May 28, 2024
rapids-bot bot pushed a commit that referenced this issue May 29, 2024
This PR creates the `pylibcudf.strings.capitalize` namespace and migrates the cuDF Cython to use it. Depends on #15489.

Part of #15162

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15503
rapids-bot bot pushed a commit that referenced this issue May 31, 2024
xref #15162

Migrate round.pxd to use pylibcudf APIs.

Authors:
  - Thomas Li (https://github.com/lithomas1)

Approvers:
  - https://github.com/brandon-b-miller
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15863
rapids-bot bot pushed a commit that referenced this issue Jun 5, 2024
xref #15162

Change replace.pxd to use pylibcudf APIs.

Authors:
  - Thomas Li (https://github.com/lithomas1)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15839
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
This PR creates pylibcudf strings `contains` APIs and migrates the cuDF Cython to leverage them. Part of #15162.

Authors:
  - https://github.com/brandon-b-miller

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15880
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
xref #15162 

Starts migrating the cuDF I/O Cython to use pylibcudf APIs, beginning with Avro.

Authors:
  - Thomas Li (https://github.com/lithomas1)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)

URL: #15899
rapids-bot bot pushed a commit that referenced this issue Jun 6, 2024
xref #15162 

Migrate quantiles.pxd to use pylibcudf APIs.

Authors:
  - Thomas Li (https://github.com/lithomas1)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #15874
rapids-bot bot pushed a commit that referenced this issue Jun 12, 2024
Part of #15162. Covers `concatenate_rows` and `concatenate_list_elements`.

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Thomas Li (https://github.com/lithomas1)

URL: #15928
rapids-bot bot pushed a commit that referenced this issue Sep 19, 2024
rapids-bot bot pushed a commit that referenced this issue Sep 19, 2024
rapids-bot bot pushed a commit that referenced this issue Sep 19, 2024
rapids-bot bot pushed a commit that referenced this issue Sep 21, 2024
Contributes to #15162

One question: I noticed that the libcudf `compute_column` takes an expression computed by a routine in https://github.com/rapidsai/cudf/blob/branch-24.10/python/cudf/cudf/core/_internals/expressions.py. Does that routine need to be moved to pylibcudf too?

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #16760
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2024
Contributes to #15162

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Lawrence Mitchell (https://github.com/wence-)
  - Matthew Murray (https://github.com/Matt711)

URL: #16825
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2024
This PR is a first pass at #15937. We will close #15937 after #15162 is closed

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #16810
rjzamora pushed a commit to rjzamora/cudf that referenced this issue Sep 25, 2024
This PR is a first pass at rapidsai#15937. We will close rapidsai#15937 after rapidsai#15162 is closed

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: rapidsai#16810
Matt711 added a commit to mroeschke/cudf that referenced this issue Sep 25, 2024
This PR is a first pass at rapidsai#15937. We will close rapidsai#15937 after rapidsai#15162 is closed

Authors:
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: rapidsai#16810
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2024
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2024
rapids-bot bot pushed a commit that referenced this issue Sep 25, 2024
Contributes to #15162

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Matthew Murray (https://github.com/Matt711)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #16785
rapids-bot bot pushed a commit that referenced this issue Sep 26, 2024
Contributes to #15162

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #16771
rapids-bot bot pushed a commit that referenced this issue Sep 26, 2024
Contributes to #15162

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Matthew Murray (https://github.com/Matt711)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #16781
Projects
Status: In Progress
Status: Story Issue
Status: Functionality