Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataframe v2: new and improved chunk tools #7649

Merged
merged 2 commits into from
Oct 10, 2024
Merged

Conversation

teh-cmc
Copy link
Member

@teh-cmc teh-cmc commented Oct 9, 2024

Bunch of improvements and/or additions to the Chunk toolbox that happened as part of the implementation of the dataframe v2 API.

Checklist

  • I have read and agree to Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested the web demo (if applicable):
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG
  • If applicable, add a new check to the release checklist!
  • If have noted any breaking changes to the log API in CHANGELOG.md and the migration guide

To run all checks from main, comment on the PR with @rerun-bot full-check.

@teh-cmc teh-cmc added enhancement New feature or request 🔍 re_query affects re_query itself include in changelog labels Oct 9, 2024
crates/store/re_chunk/src/chunk.rs Show resolved Hide resolved
/// WARNING: the returned chunk has the same old [`crate::ChunkId`]! Change it with [`Self::with_id`].
#[must_use]
#[inline]
pub fn components_removed(self) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chunk.without_components() reads more intuitively to me but I don't feel strongly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying (hard) to keep to the seemingly de-facto arrow standard of using past participles (I think that's what they're called?) for methods that take ownership, filter and return a new one.


/// Applies a [take] kernel to the [`Chunk`] as a whole.
///
/// In release builds, indices are allowed to have null entries (they will be taken as `null`s).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the situations that cause us to query with null indices? Seems like returning a ChunkResult here and always making that an error condition would be preferable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't, but this is technically part of the public Rust API, so I don't want to punish end users trying to do something that is perfectly valid and apparently well accepted in the broader ecosystem (whether its panics or results, they're both extremely annoying in these filter chains).

@teh-cmc teh-cmc merged commit ab69022 into main Oct 10, 2024
30 checks passed
@teh-cmc teh-cmc deleted the cmc/dataframev2_chunk_tools branch October 10, 2024 07:20
teh-cmc added a commit that referenced this pull request Oct 10, 2024
Support clear semantics in the dataframe API.
Tombstones are never visible to end-users, only their effect.

Like every other Dataframe v2 feature PR, and following recommendations
from @jleibs, this prioritizes convenience of implementation over
everything else, for now.
All clear chunks are fetched, post-processed, and re-injected into the
view contents during init(), and then the streaming join runs as usual
after that.

Static clear semantics can get pretty unhinged, but that's A) not
specific to the dataframe API and B) so extremely niche that our time is
better spent on real-world problems right now:
- #7650 
- #7631 

---

- Fixes #7495
- Fixes #7414
- Fixes #7468
- Fixes #7493
- DNM: requires #7649
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request include in changelog 🔍 re_query affects re_query itself
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants