NullableArrays branch #64

Closed
ExpandingMan opened this issue Oct 31, 2016 · 3 comments

Comments

@ExpandingMan
Contributor

Hello all. I'm currently using the new NullableArrays version of DataFrames, and I have been putting off making my own fork of DataFramesMeta for NullableArrays. For me personally, DataFramesMeta is by far the best solution for querying dataframes in most situations. Before I start work on a fork: has any work been done on this yet, and is there a roadmap for it? I don't see a branch or an issue, so I assume the answer to both questions is no.

@tshort
Contributor

tshort commented Oct 31, 2016

No and no, as far as I know. I haven't done any work on this yet.

@ExpandingMan
Contributor Author

ExpandingMan commented Oct 31, 2016

Ok. When I first started looking at this I thought that solving it would be unbelievably complicated, but @davidagold pointed me to some code he wrote that does some expression processing to insert lifting operations. Hopefully it's adaptable to this case. I'll look at it...

@ExpandingMan
Contributor Author

I have created my own implementation of @where for NullableArrays. I haven't recreated the functionality of @with, so right now this is sitting in some standalone functions in a repo where I keep various DataFrames utilities. My macro is called @constrain, you can see it here. The idea is basically to take the macro-generated functions that would normally be passed straight to the dataframe and pass them instead to a helper function, which handles the nulls for them. I probably haven't done anything that wouldn't be obvious to anyone else, but I haven't seen any implementations of this sort of tool for DataFrames with NullableArrays, so I thought I'd post it here for posterity.
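To make the helper-function idea concrete, here is a minimal, language-agnostic sketch written in Python rather than Julia. It is not the code from the repo mentioned above. The dict-of-lists "dataframe", the `constrain` function name and signature, and the choice to drop any row that has a null in a referenced column are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Sequence

# Hypothetical stand-in for a dataframe: column name -> list of values,
# with None playing the role of a null entry.
Columns = Dict[str, List[Any]]


def constrain(columns: Columns, names: Sequence[str],
              predicate: Callable[..., bool]) -> Columns:
    """Keep the rows for which `predicate` holds on the named columns.

    Assumption: a row with a null in any named column is dropped.
    """
    n_rows = len(next(iter(columns.values()))) if columns else 0
    keep = []
    for i in range(n_rows):
        args = [columns[name][i] for name in names]
        # The null check lives in the helper, so the generated predicate
        # only ever sees real values and needs no null handling itself.
        if any(a is None for a in args):
            continue
        if predicate(*args):
            keep.append(i)
    return {name: [col[i] for i in keep] for name, col in columns.items()}


df = {"a": [1, None, 3, 4], "b": [10.0, 20.0, None, 40.0]}

# The lambda stands in for the kind of function a macro would generate
# from a condition such as "a > 1 and b < 50".
result = constrain(df, ["a", "b"], lambda a, b: a > 1 and b < 50)
print(result)  # {'a': [4], 'b': [40.0]}
```

The point of the design is that the generated function stays a plain predicate over ordinary values, and all null-related policy sits in one place in the helper.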

I'm still not getting the same performance as I get when I avoid metaprogramming altogether, but I have mostly fixed the multiple-dispatch slowness issues that plague the most naive implementation of this. See my function _dispatchConstrainFunc!, and the difference between the functions constrain_OLD and constrain (the latter has vastly better performance). I haven't checked whether the existing DataFramesMeta suffers from these sorts of dispatch issues; naively, it seems to me that it would.

I haven't looked into how difficult it is to generalize this to @with.
