[BUG] Add an allowlist of DataTypes that ColumnRangeStatistics supports and validation of TableStatistics #1632
jaychia commented Nov 17, 2023 (edited)
- We should disallow creation of ColumnRangeStatistics from non-comparable types to avoid issues at runtime.
- We also add validation when creating MicroPartitions:
  - The column names in a MicroPartition's schema must be found in its ScanTask's schema.
  - When creating Statistics for a MicroPartition, we cast those Statistics to the MicroPartition's schema to ensure type compatibility.
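For readers skimming the thread, the two checks described above can be sketched roughly as follows. This is not the code in this PR: the `DataType` enum, the `ColumnRangeStats` struct, `StatsError`, and both function names are simplified, hypothetical stand-ins, and which types belong on the allowlist is an assumption made only for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for the project's DataType enum.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int64,
    Float64,
    Utf8,
    Boolean,
    List(Box<DataType>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum StatsError {
    UnsupportedType(DataType),
    MissingColumn(String),
}

/// Allowlist check: only types listed here may carry min/max statistics.
/// The exact set is an assumption; nested types are rejected in this sketch.
pub fn supports_range_stats(dtype: &DataType) -> bool {
    matches!(
        dtype,
        DataType::Int64 | DataType::Float64 | DataType::Utf8 | DataType::Boolean
    )
}

/// Hypothetical stand-in for ColumnRangeStatistics (bounds omitted for brevity).
#[derive(Debug, PartialEq)]
pub struct ColumnRangeStats {
    pub dtype: DataType,
}

impl ColumnRangeStats {
    /// Fails at construction time instead of at comparison time.
    pub fn new(dtype: DataType) -> Result<Self, StatsError> {
        if supports_range_stats(&dtype) {
            Ok(Self { dtype })
        } else {
            Err(StatsError::UnsupportedType(dtype))
        }
    }
}

/// Every column named in the MicroPartition schema must also appear in the
/// ScanTask schema; the first missing name is reported.
pub fn validate_column_names(
    partition_cols: &[&str],
    scan_task_cols: &[&str],
) -> Result<(), StatsError> {
    let available: HashSet<&str> = scan_task_cols.iter().copied().collect();
    for name in partition_cols {
        if !available.contains(name) {
            return Err(StatsError::MissingColumn((*name).to_string()));
        }
    }
    Ok(())
}
```

Returning a `Result` from the constructor is one possible design; it lets the caller decide whether an unsupported type is a hard error or simply means "no statistics for this column".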
LGTM! You do like your matching haha
…arsing from Parquet: 43032c1 to 2286c38
{
    panic!("MicroPartition: TableStatistics and Schema have different column names\nTableStats:\n{},\nSchema\n{}", statistics, schema);
}
assert!(
This should be a panic or a debug assert. A regular assert here will kill the program.
assert! just calls panic! under the hood. Is it special-cased for pyo3?
https://doc.rust-lang.org/std/macro.assert.html
Asserts that a boolean expression is true at runtime.
This will invoke the panic! macro if the provided expression cannot be evaluated to true at runtime.
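A small sketch of the point quoted above, using only the standard library. The function names and the message text are made up for illustration. It shows that a failing `assert!` unwinds exactly like an explicit `panic!`, while `debug_assert!` only checks its condition when debug assertions are enabled.

```rust
use std::panic;

/// Returns true if running `f` panicked. Relies on unwinding, so it will not
/// work if the crate is built with `panic = "abort"`.
pub fn panics<F: FnOnce() + panic::UnwindSafe>(f: F) -> bool {
    panic::catch_unwind(f).is_err()
}

/// Checked in every build profile; a failure invokes panic!.
pub fn check_with_assert(ok: bool) {
    assert!(ok, "column names differ");
}

/// Explicit equivalent of the assert! above.
pub fn check_with_panic(ok: bool) {
    if !ok {
        panic!("column names differ");
    }
}

/// Only checked when debug assertions are enabled (the default for debug
/// builds); compiled out otherwise.
pub fn check_with_debug_assert(ok: bool) {
    debug_assert!(ok, "column names differ");
}
```

So the choice between `assert!` and `panic!` does not change what happens on failure; only `debug_assert!` behaves differently, and only in builds without debug assertions. As far as I know, PyO3 does not treat the two differently either: a panic that reaches the Python boundary is surfaced as a `PanicException` in both cases.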