Dataframe v2: new and improved chunk tools #7649
/// WARNING: the returned chunk has the same old [`crate::ChunkId`]! Change it with [`Self::with_id`].
#[must_use]
#[inline]
pub fn components_removed(self) -> Self {
chunk.without_components() reads more intuitively to me but I don't feel strongly
I'm trying (hard) to stick to the seemingly de facto Arrow convention of using past participles (I think that's what they're called?) for methods that take ownership, filter, and return a new value.
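To illustrate the naming convention discussed above, here is a minimal sketch on a toy type. `ToyChunk`, its fields, and `demo` are hypothetical stand-ins and not the real `Chunk` API; only the method names `components_removed` and `with_id` are taken from the snippet under review, and their bodies here are assumptions.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a chunk: an id plus named component columns.
/// Hypothetical type for illustration only, not the real `Chunk`.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyChunk {
    pub id: u64,
    pub components: BTreeMap<String, Vec<i64>>,
}

impl ToyChunk {
    /// Past-participle name: takes ownership, drops every component,
    /// and returns the modified value. The id is left unchanged.
    #[must_use]
    #[inline]
    pub fn components_removed(mut self) -> Self {
        self.components.clear();
        self
    }

    /// Takes ownership and returns the value with a new id.
    #[must_use]
    #[inline]
    pub fn with_id(mut self, id: u64) -> Self {
        self.id = id;
        self
    }
}

/// Builds a toy chunk, strips its components, then assigns a fresh id.
pub fn demo() -> ToyChunk {
    let mut components = BTreeMap::new();
    components.insert("Position3D".to_owned(), vec![1, 2, 3]);
    ToyChunk { id: 1, components }
        .components_removed()
        .with_id(2)
}
```

Because each method consumes `self` and returns `Self`, calls chain without intermediate bindings, and `#[must_use]` makes the compiler warn if the returned value is discarded.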
/// Applies a [take] kernel to the [`Chunk`] as a whole.
///
/// In release builds, indices are allowed to have null entries (they will be taken as `null`s).
What are the situations that cause us to query with null indices? Seems like returning a ChunkResult here and always making that an error condition would be preferable.
We don't, but this is technically part of the public Rust API, so I don't want to punish end users for doing something that is perfectly valid and apparently well accepted in the broader ecosystem (whether it's panics or results, they're both extremely annoying in these filter chains).
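For readers unfamiliar with the behavior being debated, the following is a rough sketch of a take (gather) operation that tolerates null indices, written against plain slices rather than Arrow arrays. The function name `take_nullable` is hypothetical, and mapping out-of-bounds indices to `None` is an assumption of this sketch, not something the PR states.

```rust
/// Gathers `values[i]` for each index in `indices`.
///
/// A `None` index produces a `None` output, so the call neither panics
/// nor returns a `Result`. Assumption of this sketch: an out-of-bounds
/// index is also mapped to `None`.
pub fn take_nullable<T: Clone>(values: &[T], indices: &[Option<usize>]) -> Vec<Option<T>> {
    indices
        .iter()
        .map(|&idx| idx.and_then(|i| values.get(i).cloned()))
        .collect()
}
```

With this shape, `take_nullable(&["a", "b", "c"], &[Some(2), None, Some(0)])` yields `[Some("c"), None, Some("a")]`, which is why such a call composes in a filter chain without any error handling at the call site.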
Support clear semantics in the dataframe API.

Tombstones are never visible to end-users, only their effect.

Like every other Dataframe v2 feature PR, and following recommendations from @jleibs, this prioritizes convenience of implementation over everything else, for now. All clear chunks are fetched, post-processed, and re-injected into the view contents during init(), and then the streaming join runs as usual after that.

Static clear semantics can get pretty unhinged, but that's A) not specific to the dataframe API and B) so extremely niche that our time is better spent on real-world problems right now:
- #7650
- #7631

---

- Fixes #7495
- Fixes #7414
- Fixes #7468
- Fixes #7493
- DNM: requires #7649
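As a loose illustration of "tombstones are never visible, only their effect", here is a toy latest-at lookup over a single entity's timeline. `Event` and `latest_at` are hypothetical names, the input is assumed to be sorted by time, and none of this reflects how the actual implementation fetches and re-injects clear chunks.

```rust
/// One logged event for a single entity. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    Value(i64),
    /// Tombstone: hides earlier values until new data is logged.
    Clear,
}

/// Returns the value visible at `time`, given events sorted by time.
///
/// The most recent event at or before `time` decides the result. If it is
/// a `Clear`, the caller sees `None`; the tombstone itself is never returned.
pub fn latest_at(events: &[(u64, Event)], time: u64) -> Option<i64> {
    events
        .iter()
        .filter(|(t, _)| *t <= time)
        .last()
        .and_then(|(_, event)| match event {
            Event::Value(v) => Some(*v),
            Event::Clear => None,
        })
}
```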
Bunch of improvements and/or additions to the Chunk toolbox that happened as part of the implementation of the dataframe v2 API.
To run all checks from main, comment on the PR with @rerun-bot full-check.