Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Key::I64 and Key::U64 variants in aggregation #2468

Merged
merged 2 commits into from
Jul 31, 2024
Merged

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Jul 31, 2024

Currently all Key numerical values are returned as f64. This causes problems in some
cases with the precision and the way f64 is serialized.

This PR adds Key::I64 and Key::U64 variants and uses them in the term
aggregation.

Fixes quickwit-oss/quickwit#5254

Currently all `Key` numerical values are returned as f64. This causes problems in some
cases with the precision and the way f64 is serialized.

This PR adds `Key::I64` and `Key::U64` variants and uses them in the term
aggregation.
@@ -339,9 +341,18 @@ fn get_missing_val(
Key::Str(_) if column_type == ColumnType::Str => Some(u64::MAX),
// Allow fallback to number on text fields
Key::F64(_) if column_type == ColumnType::Str => Some(u64::MAX),
Key::U64(_) if column_type == ColumnType::Str => Some(u64::MAX),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what get_missing_val is about.

It is clearly doing something strange. Do you remember?
Can you comment/rename to make it make sense?

Copy link
Contributor Author

@PSeitz PSeitz Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, there is definitely a comment missing to explain it

/// Get the missing value as internal u64 representation
///
/// For terms we use u64::MAX as sentinel value
/// For numerical data we convert the value into the representation
/// we would get from the fast field, when we open it as u64_lenient_for_type.
///
/// That way we can use it the same way as if it would come from the fastfield.

@@ -191,6 +191,12 @@ impl SegmentCardinalityCollector {
let val = f64_to_u64(*val);
self.cardinality.sketch.insert_any(&val);
}
Key::U64(val) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for using this PR..

.expect("Found placeholder term_ord but missing is None");

I think we are just supposed to ignore the value.

@PSeitz PSeitz merged commit 0d4e319 into main Jul 31, 2024
4 checks passed
@PSeitz PSeitz deleted the fix_f64_truncation branch July 31, 2024 12:29
philippemnoel pushed a commit to paradedb/tantivy that referenced this pull request Aug 31, 2024
* add Key::I64 and Key::U64 variants in aggregation

Currently all `Key` numerical values are returned as f64. This causes problems in some
cases with the precision and the way f64 is serialized.

This PR adds `Key::I64` and `Key::U64` variants and uses them in the term
aggregation.

* add clarification comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Aggregations that return field values do not respect output format
2 participants