Add rollup functionality to frequency aggregates #696

Merged Feb 7, 2023 (1 commit)

@WireBaron WireBaron commented Jan 31, 2023

This change adds rollup functions for frequency aggregates.

Note that rolling up a set of frequency aggregates will not necessarily give the same result as computing a single aggregate over the underlying data: the upper and lower bounds on frequency may differ. The rollup does maintain the same invariants, however, and will still identify the most common elements as long as their frequencies are sufficiently different.
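To illustrate why a rollup can keep the invariants while giving looser bounds, here is a minimal sketch of merging two space-saving summaries. It is not the code in this PR: `Entry`, `Summary`, `untracked_bound`, and `rollup` are hypothetical names, values are plain strings rather than datums, and both inputs are assumed to share one capacity. A value missing from one side is charged that side's worst-case frequency, which widens the gap between its upper and lower bound.

```rust
use std::collections::HashMap;

/// One tracked value (illustrative layout, not the toolkit's struct).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    value: String,
    count: u64,     // upper bound on the value's true frequency
    overcount: u64, // count - overcount is a lower bound
}

/// A space-saving summary tracking at most `capacity` values.
#[derive(Debug, Clone)]
struct Summary {
    capacity: usize,
    total_vals: u64,
    entries: Vec<Entry>,
}

impl Summary {
    /// Largest frequency a value absent from `entries` could have.
    fn untracked_bound(&self) -> u64 {
        if self.entries.len() < self.capacity {
            0 // never full, so nothing was evicted
        } else {
            self.entries.iter().map(|e| e.count).min().unwrap_or(0)
        }
    }
}

/// Merge two summaries of equal capacity into one.
fn rollup(a: &Summary, b: &Summary) -> Summary {
    assert_eq!(a.capacity, b.capacity, "sketch assumes equal capacities");
    let (bound_a, bound_b) = (a.untracked_bound(), b.untracked_bound());

    // value -> (count, overcount)
    let mut merged: HashMap<&str, (u64, u64)> = HashMap::new();
    for e in &a.entries {
        // Pessimistically assume the value is untracked in `b`.
        merged.insert(e.value.as_str(), (e.count + bound_b, e.overcount + bound_b));
    }
    for e in &b.entries {
        merged
            .entry(e.value.as_str())
            .and_modify(|(c, o)| {
                // Tracked on both sides: swap the guess for b's real numbers.
                *c = *c - bound_b + e.count;
                *o = *o - bound_b + e.overcount;
            })
            .or_insert((e.count + bound_a, e.overcount + bound_a));
    }

    let mut entries: Vec<Entry> = merged
        .into_iter()
        .map(|(v, (count, overcount))| Entry { value: v.to_string(), count, overcount })
        .collect();
    // Highest upper bound first; ties broken by value for determinism.
    entries.sort_by(|x, y| y.count.cmp(&x.count).then_with(|| x.value.cmp(&y.value)));
    entries.truncate(a.capacity);

    Summary {
        capacity: a.capacity,
        total_vals: a.total_vals + b.total_vals,
        entries,
    }
}
```

In this sketch every merged count is at least `bound_a + bound_b`, so any value dropped by the truncation, or never tracked at all, still has a true frequency no larger than the smallest retained count.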

Fixes #685

    counts: &[u64],
    overcounts: &[u64],
) {
    self.total_vals = val_count;

Contributor

Should this one assert TEXTOID?

Contributor

Also wondering this, unless there's a reason for only checking it in ingest_aggregate_ints()?

Contributor Author

This same function is called by both text aggregates and raw aggregates (the AnyElement version of this). So the OID can really be anything here.

self.total_vals = val_count;

for (idx, datum) in values.iter().enumerate() {
    self.entries.push(SpaceSavingEntry {

Contributor

Should these assert entries is empty?

It seems these are just constructor part two; have you considered folding the calls in?

Contributor Author

Asserting entries is empty is probably a good idea. As for constructors, the problem is that there are a couple constructors and which one is used is completely independent of which call we use to fill them. So if I wanted to fold them in I'd have to have a function for every combination of calls.
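A minimal sketch of the suggested guard, assuming a simplified state type: `Transform` and `ingest_aggregate_data` are hypothetical names, the `SpaceSavingEntry` fields are guesses, and `i64` stands in for the datum values. The fill method stays separate from the constructors but refuses to run on a state that already holds entries.

```rust
/// Illustrative entry; the field names are assumptions.
#[derive(Debug, PartialEq)]
struct SpaceSavingEntry {
    value: i64,
    count: u64,
    overcount: u64,
}

/// Hypothetical stand-in for the aggregate state being filled.
#[derive(Debug, Default)]
struct Transform {
    total_vals: u64,
    entries: Vec<SpaceSavingEntry>,
}

impl Transform {
    /// Fill a freshly constructed state from parallel slices.
    fn ingest_aggregate_data(
        &mut self,
        val_count: u64,
        values: &[i64],
        counts: &[u64],
        overcounts: &[u64],
    ) {
        // "Constructor part two": only valid on an empty state.
        assert!(self.entries.is_empty(), "ingest called on a non-empty state");
        assert_eq!(values.len(), counts.len());
        assert_eq!(values.len(), overcounts.len());

        self.total_vals = val_count;
        for (idx, value) in values.iter().enumerate() {
            self.entries.push(SpaceSavingEntry {
                value: *value,
                count: counts[idx],
                overcount: overcounts[idx],
            });
        }
    }
}
```

Keeping the check inside the fill method means any constructor can be paired with any fill call without a separate function for each combination.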

@thatzopoulos thatzopoulos left a comment

I only have the one question about assert!; otherwise, this looks good to me.

@epgts epgts left a comment

I think that pg15 failure might be a flake; I've had that test running for a few minutes now with no failures.

@WireBaron
Contributor Author

> I think that pg15 failure might be a flake; I've had that test running for a few minutes now with no failures.

That was my thought as well; that test shouldn't be affected by this change at all. Still, I'll take a closer look and see if we can make the test less flaky.

@WireBaron
Contributor Author

bors r+

bors bot added a commit that referenced this pull request Feb 6, 2023
696: Add rollup functionality to frequency aggregates r=WireBaron a=WireBaron

This change adds rollup functions for frequency aggregates.

Note that rolling up a set of frequency aggregates will not necessarily give the same result as computing a single aggregate over the underlying data: the upper and lower bounds on frequency may differ. The rollup does maintain the same invariants, however, and will still identify the most common elements as long as their frequencies are sufficiently different.

Fixes #685 

Co-authored-by: Brian Rowe <[email protected]>

bors bot commented Feb 6, 2023

Build failed:

@WireBaron
Contributor Author

bors r+


bors bot commented Feb 7, 2023

@bors bors bot merged commit cac9a9e into main Feb 7, 2023
@bors bors bot deleted the br/freq_rollup branch February 7, 2023 15:26
Successfully merging this pull request may close these issues.

rollup for freq_agg and topn_agg
3 participants