[CHORE] Move out datatype and schema from daft-core #2806
CodSpeed Performance Report
Merging #2806 will degrade performances by 37.79%
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2806 +/- ##
==========================================
- Coverage 63.30% 63.26% -0.05%
==========================================
Files 1007 1008 +1
Lines 114142 114189 +47
==========================================
- Hits 72262 72242 -20
- Misses 41880 41947 +67
Looking great!
Main question: why do we `use daft_core::prelude::{Schema, SchemaRef, DataType...}` in a bunch of places instead of using `daft_schema` directly? And the corollary: why does `daft_core::prelude` re-export a bunch of stuff from `daft_schema::prelude`?
I also feel like we might want to narrow down the things in our `prelude::*` a little more. I found myself getting a little confused by certain more "specialized" types like `ImageFormat`, which just magically appeared and actually came from the `daft-schema` prelude.
image::ImageFormat::Tiff => ImageFormat::TIFF,
image::ImageFormat::Gif => ImageFormat::GIF,
image::ImageFormat::Bmp => ImageFormat::BMP,
_ => unimplemented!("Image format {:?} is not supported", image_format),
Should this be a TryFrom thing instead of panicking?
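A minimal sketch of what the `TryFrom` suggestion could look like. It assumes a stand-in enum `ExternalFormat` in place of `image::ImageFormat` so that it compiles without the `image` crate, and a hypothetical error type `UnsupportedFormat`; the `PNG` and `JPEG` variants are also assumed, since only `TIFF`, `GIF` and `BMP` appear in the diff above. This is not the code in the PR.

```rust
use std::convert::TryFrom;
use std::fmt;

// Hypothetical stand-in for `image::ImageFormat` (assumption, keeps the sketch dependency-free).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExternalFormat {
    Png,
    Jpeg,
    Tiff,
    Gif,
    Bmp,
    WebP,
}

// Only TIFF, GIF and BMP are visible in the diff; PNG and JPEG are assumed here.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ImageFormat {
    PNG,
    JPEG,
    TIFF,
    GIF,
    BMP,
}

// Hypothetical error type carrying the format that could not be converted.
#[derive(Debug, PartialEq)]
pub struct UnsupportedFormat(pub ExternalFormat);

impl fmt::Display for UnsupportedFormat {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Image format {:?} is not supported", self.0)
    }
}

impl std::error::Error for UnsupportedFormat {}

impl TryFrom<ExternalFormat> for ImageFormat {
    type Error = UnsupportedFormat;

    // Returns an error for unknown formats instead of panicking via `unimplemented!`.
    fn try_from(format: ExternalFormat) -> Result<Self, Self::Error> {
        match format {
            ExternalFormat::Png => Ok(Self::PNG),
            ExternalFormat::Jpeg => Ok(Self::JPEG),
            ExternalFormat::Tiff => Ok(Self::TIFF),
            ExternalFormat::Gif => Ok(Self::GIF),
            ExternalFormat::Bmp => Ok(Self::BMP),
            other => Err(UnsupportedFormat(other)),
        }
    }
}
```

With this shape, a caller would write `ImageFormat::try_from(fmt)?` and decide how to surface the error, rather than having the conversion abort the process.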
@@ -937,7 +937,7 @@ impl Utf8Array {
     pub fn to_datetime(&self, format: &str, timezone: Option<&str>) -> DaftResult<TimestampArray> {
         let len = self.len();
         let self_iter = self.as_arrow().iter();
-        let timeunit = crate::datatypes::utils::infer_timeunit_from_format_string(format);
+        let timeunit = daft_schema::time_unit::infer_timeunit_from_format_string(format);
I feel like I prefer this to the `prelude` pattern that the `datatypes` module exposes (e.g. it felt a little confusing when I was trying to figure out where `ImageFormat` came from earlier, until I finally realized it came from `datatypes::prelude`).
Maybe we need to find a happy medium for what should actually be in the prelude?
The rule of thumb I'm going with is: anything that is used in a public API of that module.
Also the prelude is an optional opt-in, it's not automatically imported.
> The rule of thumb im going with is anything that is used in a public API of that module.
Isn't that just `use daft_core::*`? The `*::prelude::*` pattern feels more like "please import all the stuff I absolutely will need to work with your crate".
For example, in `pyo3` I think it's all the pyclass and pymethod stuff, but for more specific things we still do separate imports.
> However, there's another common exception where wildcard imports make sense. Some crates have a convention that common items for the crate are re-exported from a prelude module, which is explicitly intended to be wildcard imported:
> use thing::prelude::*;
> Although in theory the same concerns apply in this case, in practice such a prelude module is likely to be carefully curated, and higher convenience may outweigh a small risk of future problems.
You can take a look at datafusion-core's prelude.
Also happy to put things within a namespace under prelude.
It also looks like pyo3 puts a ton under their prelude (all the methods and traits):
https://github.com/PyO3/pyo3/blob/main/src/prelude.rs
Also, `use daft_core::*` would just import all the public modules, which is different behavior. Instead of `Series`, `DataArray`, `ListArray`, it would bring in the modules `series`, `array`, and `datatypes`, which is not what we want.
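The difference between the two glob imports can be sketched with a miniature crate layout. Everything here is illustrative: `mini_core` stands in for `daft_core`, and the modules and unit structs are hypothetical placeholders, not Daft's real definitions.

```rust
// Hypothetical miniature of a crate that exposes modules plus a curated prelude.
mod mini_core {
    pub mod series {
        #[derive(Debug, Default)]
        pub struct Series;
    }

    pub mod datatypes {
        #[derive(Debug, Default)]
        pub struct DataType;
    }

    // The prelude re-exports the types themselves, not the modules.
    pub mod prelude {
        pub use super::datatypes::DataType;
        pub use super::series::Series;
    }
}

mod via_root_glob {
    // Brings the *modules* `series`, `datatypes` and `prelude` into scope,
    // so types still need a module-qualified path.
    use super::mini_core::*;

    pub fn make() -> series::Series {
        series::Series
    }
}

mod via_prelude {
    // Opt-in glob over the prelude: the *types* are in scope directly.
    use super::mini_core::prelude::*;

    pub fn make() -> (Series, DataType) {
        (Series, DataType)
    }
}
```

In `via_root_glob`, writing a bare `Series` would not compile, which matches the point above: a root glob only yields module names unless the crate root itself re-exports the types.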
src/daft-schema/src/dtype.rs
Outdated
-    daft_version: crate::VERSION.into(),
-    daft_build_type: crate::DAFT_BUILD_TYPE.into(),
+    // daft_version: crate::VERSION.into(),
+    // daft_build_type: crate::DAFT_BUILD_TYPE.into(),
What's this and why is it commented out?
Ah, these were declared in daft-core (Daft version and build type). I'm factoring these out to a common crate.
daft-schema
common-display
common-arrow-ffi
`StrValue` trait that allows us to pass `&dyn StrValue` to the common-display crate without it knowing what a series is.
`daft_schema::{DataType, Schema, Field}` for convenience
Follow on:
daft-schema
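
The `StrValue` idea from the description (passing `&dyn StrValue` to the display crate so it never needs to know what a series is) could look roughly like the sketch below. The method name `str_value`, its signature, the `render_column` helper, and the toy `Series` struct are all assumptions made for illustration; the actual trait in the PR may differ.

```rust
// Assumed trait shape: the display side only needs "give me row `idx` as a string".
pub trait StrValue {
    fn str_value(&self, idx: usize) -> Result<String, String>;
}

// "common-display" side (hypothetical helper): depends on the trait only,
// not on any concrete column type.
pub fn render_column(col: &dyn StrValue, len: usize) -> String {
    (0..len)
        .map(|i| col.str_value(i).unwrap_or_else(|e| format!("<{e}>")))
        .collect::<Vec<_>>()
        .join("\n")
}

// "daft-core" side: a toy stand-in for a series that implements the trait.
pub struct Series {
    values: Vec<i64>,
}

impl StrValue for Series {
    fn str_value(&self, idx: usize) -> Result<String, String> {
        self.values
            .get(idx)
            .map(|v| v.to_string())
            .ok_or_else(|| format!("index {idx} out of bounds"))
    }
}
```

The design choice is dependency inversion: the display crate owns the trait, the core crate implements it, so the display crate does not need to depend on the core crate.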