Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking changes with DataFrames master #61

Closed
amellnik opened this issue Oct 3, 2016 · 6 comments
Closed

Breaking changes with DataFrames master #61

amellnik opened this issue Oct 3, 2016 · 6 comments
Assignees
Milestone

Comments

@amellnik
Copy link

amellnik commented Oct 3, 2016

I'm sure you're aware of this, but opening an issue to track. DataArrays was recently removed from DataFrames master and the default constructors now use Nullables.

using Query, DataFrames
WARNING: Method definition require(Symbol) in module Base at loading.jl:345 overwritten in module Query at C:\Users\Alex\.julia\v0.5\Requires\src\require.jl:12.
WARNING: Method definition ==(Base.Nullable{#T<:Any}, Base.Nullable{Union{}}) in module NullableArrays at C:\Users\Alex\.julia\v0.5\NullableArrays\src\operators.jl:131 overwritten in module Query at C:\Users\Alex\.julia\v0.5\Query\src\operators.jl:8.
WARNING: Method definition ==(Base.Nullable{Union{}}, Base.Nullable{#T<:Any}) in module NullableArrays at C:\Users\Alex\.julia\v0.5\NullableArrays\src\operators.jl:130 overwritten in module Query at C:\Users\Alex\.julia\v0.5\Query\src\operators.jl:9.
WARNING: Method definition ==(Base.Nullable{#T1<:Any}, Base.Nullable{#T2<:Any}) in module NullableArrays at C:\Users\Alex\.julia\v0.5\NullableArrays\src\operators.jl:121 overwritten in module Query at C:\Users\Alex\.julia\v0.5\Query\src\operators.jl:77.
WARNING: Error requiring DataFrames from Query:
ArgumentError: Module DataArrays not found in current path.
Run `Pkg.add("DataArrays")` to install the DataArrays package.

This also breaks Query blocks and @collect:

df = DataFrame(a=[2,1,1,2,1,3],b=[2,2,1,1,3,2])

x = @from i in df begin
    @orderby descending(i.a), i.b
    @select i
    @collect DataFrame
end
LoadError: MethodError: no method matching collect(::Query.EnumerableThenBy{Any,Query.EnumerableOrderby{Any,Query.EnumerableIterable{Any,DataFrames.DataFrame},FunctionWrappers.FunctionWrapper{Any,Tuple{Any}},Any},FunctionWrappers.FunctionWrapper{Any,Tuple{Any}},Any}, ::Type{DataFrames.DataFrame})
Closest candidates are:
  collect{T}(!Matched::Type{T}, ::Any) at array.jl:248
  collect{T<:NamedTuples.NamedTuple}(!Matched::Query.Enumerable{T<:NamedTuples.NamedTuple}, ::Type{DataFrames.DataFrame}) at C:\Users\Alex\.julia\v0.5\Query\src\sinks/sink_dataframe.jl:23
  collect{T}(::Query.Enumerable{T}) at C:\Users\Alex\.julia\v0.5\Query\src\sinks/sink_array.jl:2
  ...
while loading In[5], in expression starting on line 5
@davidanthoff
Copy link
Member

Yes, good to track this! The original implementation of Query.jl actually used the NullableArrays branch of DataFrames, and then I changed it back to DataArrays so that I could release the package. The change is really simple and I probably can just use the old code again. I'll prep a branch with this for now until DataFrames gets tagged.

@davidanthoff davidanthoff added this to the Backlog milestone Oct 3, 2016
@davidanthoff davidanthoff self-assigned this Oct 3, 2016
@amellnik
Copy link
Author

amellnik commented Oct 4, 2016

Here's an interesting discovery: Query is providing a very handy method getindex(::Nullable{String}, ::UnitRange{Int64}). I noticed that doing things like

using DataFrames, Query #Both at master
df = readtable("something.csv") #Has a string column called LongNames with many characters
df[:ShortNames] = [n[1:4] for n in df[:LongNames]]

This gives a no method error if Query is not included. Is this intended? I'm not sure this is a problem, but if this method is needed shouldn't it be in NullableArrays or Base? Just noting it here while mining issues in DataFrames master.

@davidanthoff
Copy link
Member

I'v added lots of lifted versions of base methods for Nullables. I hope that all of those definitions will move into base and out of Query. But without them it is really painful to use Query, so that is why they are there for now...

There is also some talk of creating a NullableOps package where these things could live until they are moved into base.

@davidanthoff
Copy link
Member

See #62.

@amellnik
Copy link
Author

amellnik commented Oct 4, 2016

@davidanthoff Amen to that, and thanks for putting them together. Is there a thread on a possible NullableOps package somewhere? I didn't see it on the mailing list or on google.

@davidanthoff
Copy link
Member

master now supports both DataFrames and DataTables, so I'm closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants