Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement support for retaining Pandas index #6061

Merged
merged 48 commits into from
Apr 19, 2024
Merged

Conversation

philippjfr
Copy link
Member

@philippjfr philippjfr commented Jan 6, 2024

The Pandas support in HoloViews has always been severely hampered by the fact that we do not properly support indexes, i.e. if a user references an index column we are forced to call .reset_index() which is inefficient and breaks one core principle of HoloViews, which is that we simply provide thin wrappers around data. It also has significant performance implications both from a memory perspective and from a speed perspective since indexes can provide significant speedups when indexing or performing aggregations. One other major issue solved by supporting indexes is the problem of having multiple elements providing a view onto the same DataFrame, e.g. when you have an NdOverlay of curves each visualizing a different column.

Implements #2537

  • Implement all operations correctly
    • __init__
    • validate
    • dtype
    • dimension_type
    • range
    • iloc (check multi-index)
    • values (check multi-index)
    • sort
    • reindex
    • sample
  • Optimized operations
    • select
    • groupby
  • Check operations that use dataset.data directly do not make (now invalid) assumptions about indexes
  • Add tests

@philippjfr philippjfr added the type: enhancement Minor feature or improvement to an existing feature label Jan 6, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jan 6, 2024

Codecov Report

Attention: Patch coverage is 20.06579% with 243 lines in your changes are missing coverage. Please review.

Project coverage is 26.96%. Comparing base (342d81c) to head (4546212).
Report is 15 commits behind head on main.

❗ Current head 4546212 differs from pull request most recent head 1476fac. Consider uploading reports for the commit 1476fac to get more accurate results

Files Patch % Lines
holoviews/tests/core/data/test_pandasinterface.py 21.17% 134 Missing ⚠️
holoviews/core/data/pandas.py 16.39% 102 Missing ⚠️
holoviews/core/data/ibis.py 41.66% 7 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #6061       +/-   ##
===========================================
- Coverage   88.68%   26.96%   -61.73%     
===========================================
  Files         316      318        +2     
  Lines       66072    67132     +1060     
===========================================
- Hits        58598    18104    -40494     
- Misses       7474    49028    +41554     
Flag Coverage Δ
ui-tests 26.96% <20.06%> (+3.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@droumis
Copy link
Member

droumis commented Jan 9, 2024

may partially resolve #6058

@hoxbro hoxbro marked this pull request as draft February 8, 2024 20:24
@hoxbro hoxbro marked this pull request as ready for review April 8, 2024 12:53
Copy link
Member Author

@philippjfr philippjfr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, left a few suggestions and questions for you.

@philippjfr
Copy link
Member Author

Okay, I'm happy with this PR personally. We really need to merge this and test it extensively in all kinds of scenarios over the next few weeks.

@hoxbro hoxbro enabled auto-merge (squash) April 19, 2024 12:45
@hoxbro hoxbro merged commit 39489c5 into main Apr 19, 2024
11 checks passed
@hoxbro hoxbro deleted the pandas_index_support branch April 19, 2024 13:22
@droumis droumis mentioned this pull request Apr 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement Minor feature or improvement to an existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants