[PERF] Iceberg Partition Pruning #1688

Merged
samster25 merged 35 commits into main from sammy/iceberg-partition-transforms
Dec 22, 2023

@samster25 samster25 commented Dec 1, 2023

  • Implements partition transforms, which map source fields to partition fields
  • Implements partition filtering when creating scan tasks
  • Implements predicate-to-partition-filter rewriting
  • Allows the Iceberg scan to leverage partition filtering
  • Implements EmptyScan, which kicks in whenever there are no files to scan
  • Fixes a bug where a ScanTask with both a predicate and a limit reported an incorrect length
  • Fixes a bug where df.num_partitions() computed the number of partitions without first optimizing the logical plan
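
The idea behind the first few bullets can be sketched in plain Python. This is only an illustration, not the code in this PR: `DataFile`, `month_transform`, `rewrite_ge_to_partition_filter`, and `prune` are hypothetical names, and the sketch assumes a table partitioned by a month transform (months since 1970-01, as the Iceberg spec defines it). A source predicate `ts >= lower` is rewritten into a conservative filter on the partition value, and files whose partition cannot match are skipped before any data is read.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, List


def month_transform(d: date) -> int:
    """Map a source date to a partition value: months since 1970-01."""
    return (d.year - 1970) * 12 + (d.month - 1)


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for a data file plus its partition value."""
    path: str
    partition_month: int


def rewrite_ge_to_partition_filter(lower: date) -> Callable[[DataFile], bool]:
    """Rewrite the source predicate `ts >= lower` into a partition filter.

    The month transform is monotonic, so `ts >= lower` implies
    `month(ts) >= month(lower)`. The rewritten filter is conservative:
    the boundary month is kept even though some of its rows may fail the
    original predicate, which must still be applied to the rows read.
    """
    bound = month_transform(lower)
    return lambda f: f.partition_month >= bound


def prune(files: List[DataFile], partition_filter: Callable[[DataFile], bool]) -> List[DataFile]:
    """Keep only the files whose partition value can satisfy the filter."""
    return [f for f in files if partition_filter(f)]


files = [
    DataFile("oct.parquet", month_transform(date(2023, 10, 1))),
    DataFile("nov.parquet", month_transform(date(2023, 11, 1))),
    DataFile("dec.parquet", month_transform(date(2023, 12, 1))),
]

# ts >= 2023-11-15 keeps the November and December files only.
kept = prune(files, rewrite_ge_to_partition_filter(date(2023, 11, 15)))
print([f.path for f in kept])
```

If the filter rejects every file, the pruned list is empty, which is the situation the EmptyScan bullet describes: there is nothing left to scan.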

@github-actions bot added the enhancement (New feature or request) label on Dec 1, 2023

codecov bot commented Dec 16, 2023

Codecov Report

Attention: 42 lines in your changes are missing coverage. Please review.

Comparison is base (b4f3ae1) 85.19% compared to head (d6d7cc9) 84.76%.


@@            Coverage Diff             @@
##             main    #1688      +/-   ##
==========================================
- Coverage   85.19%   84.76%   -0.43%     
==========================================
  Files          55       55              
  Lines        5518     5554      +36     
==========================================
+ Hits         4701     4708       +7     
- Misses        817      846      +29     
Files                                        Coverage Δ
daft/dataframe/dataframe.py                  87.28% <100.00%> (ø)
daft/execution/physical_plan.py              93.40% <ø> (ø)
daft/io/scan.py                              0.00% <0.00%> (ø)
daft/execution/rust_physical_plan_shim.py    93.10% <61.53%> (-5.57%) ⬇️
daft/iceberg/iceberg_scan.py                 0.00% <0.00%> (ø)

@samster25 changed the title from "[FEAT] Sammy/iceberg partition transforms" to "[PERF] Iceberg Partition Pruning" on Dec 21, 2023
@samster25 merged commit 94bb370 into main on Dec 22, 2023
39 of 40 checks passed
@samster25 deleted the sammy/iceberg-partition-transforms branch on December 22, 2023 00:01
Labels: enhancement (New feature or request), performance