Date transformers casting dtypes if passed in as an input #142
Comments
You sometimes don't want your datetime columns to be cast to a new type, even if that type is what the transformer requires, so the date transformers could use a temporary copy of the column for the transformation. |
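A minimal sketch of the temporary-copy idea, using plain pandas rather than tubular's own code. The helper name `year_diff_without_casting` and the column names are hypothetical, chosen only for illustration:

```python
import datetime

import pandas as pd


def year_diff_without_casting(df: pd.DataFrame, lower: str, upper: str) -> pd.Series:
    """Hypothetical helper: difference in calendar years, computed on temporary copies."""
    # pd.to_datetime returns a new Series, so the columns in df keep their dtype.
    lower_tmp = pd.to_datetime(df[lower])
    upper_tmp = pd.to_datetime(df[upper])
    return upper_tmp.dt.year - lower_tmp.dt.year


df = pd.DataFrame(
    {
        "start": [datetime.date(2020, 1, 1), datetime.date(2021, 6, 15)],
        "end": [datetime.date(2022, 1, 1), datetime.date(2024, 6, 15)],
    }
)
diff = year_diff_without_casting(df, "start", "end")
```

After the call, `df["start"]` still holds `datetime.date` objects; only the temporary Series were converted.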
Will look at this this sprint |
DateDiffLeapYearTransformer also raises an error incorrectly when passed a datetime object |
I've had a look at this, and it looks like the conflicting dtypes can be fixed quickly by making a BaseDateTimeTransformer and adding a standardised datetime input check that all of the transformers use. This should make it much easier to put columns in the right format beforehand with a ToDateTimeTransformer. With regard to casting, I think this would be a lot more faff to do, would create scope for many more hidden errors, and would add a lot of testing overhead, especially given that the copied datetime columns can be created using a ToDateTimeTransformer one step earlier in the pipeline. |
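Roughly what a shared input check could look like. This is a sketch only, not the actual tubular implementation: the class names, the method name `check_columns_are_datetime` and the error message are all assumptions, and it relies only on `pandas.api.types.is_datetime64_any_dtype`:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


class BaseDateTimeCheckSketch:
    """Hypothetical base class holding one shared datetime input check."""

    def __init__(self, columns):
        self.columns = list(columns)

    def check_columns_are_datetime(self, X: pd.DataFrame) -> None:
        # Collect every column that is not a datetime64 dtype, then fail once.
        bad = {
            c: str(X[c].dtype)
            for c in self.columns
            if not is_datetime64_any_dtype(X[c])
        }
        if bad:
            raise TypeError(
                f"{type(self).__name__}: columns must be datetime64, got {bad}"
            )


class YearExtractorSketch(BaseDateTimeCheckSketch):
    """Hypothetical child transformer that reuses the shared check."""

    def transform(self, X: pd.DataFrame) -> pd.DataFrame:
        self.check_columns_are_datetime(X)
        X = X.copy()
        for c in self.columns:
            X[f"{c}_year"] = X[c].dt.year
        return X


example = pd.DataFrame({"d": pd.to_datetime(["2020-01-01", "2021-06-15"])})
result = YearExtractorSketch(["d"]).transform(example)
```

Every child transformer then fails in the same way with the same message, and the fix for the user is always the same: convert the column one step earlier in the pipeline.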
I've characterised the problem slightly more. Considering the two main datetime datatypes, datetime.date and datetime.datetime:
|
We have had some issues in the past with tubular date transformers each requiring a specific dtype, with those requirements conflicting between transformers, i.e. datetime.object, datetime.date, datetime64[ns].
For example, ExtractTimeInfo and DateDifferenceTransformer require different dtypes.
I believe an option to automatically cast the column to the required dtype, as opposed to failing outright, would be preferable.
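To illustrate the kind of option meant here, a small sketch in plain pandas. The function `ensure_datetime` and its `cast` flag are hypothetical and not part of tubular:

```python
import datetime

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def ensure_datetime(column: pd.Series, cast: bool = False) -> pd.Series:
    """Hypothetical helper: return a datetime64 Series, casting only if asked to."""
    if is_datetime64_any_dtype(column):
        return column
    if not cast:
        # Strict behaviour: fail directly on an unexpected dtype.
        raise TypeError(f"expected a datetime64 column, got {column.dtype}")
    # Opt-in behaviour: convert to datetime64 and return the converted Series.
    return pd.to_datetime(column)


dates = pd.Series([datetime.date(2020, 1, 1), datetime.date(2021, 6, 15)])
converted = ensure_datetime(dates, cast=True)
```

With `cast=False` (the default in this sketch) the same call raises a `TypeError`, so the strict behaviour stays available to anyone who relies on it.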