StringArray columns returned when no pulling #435

Closed
bkamins opened this issue May 17, 2019 · 9 comments

Comments

@bkamins
Member

bkamins commented May 17, 2019

By default, when a column contains strings and pooling is disabled, the type of the returned column is WeakRefStrings.StringArray{String,1}. This has two consequences:

There are two things I would propose:

  • improve the documentation (@quinnj you probably should know what is best to write :))
  • discuss if we should provide some way to get a "normal" Vector{String} in the output.
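
As a rough illustration of the second point, here is a minimal sketch of getting a plain Vector{String} from some other AbstractVector{String} using only Base. A view stands in for the non-Vector column (an assumption made so the snippet runs without any packages), and the column is assumed to contain no missing values. Whether this is the best route for StringArray itself is not confirmed here.

```julia
# Stand-in for a non-Vector string column: a view is an AbstractVector{String}
# but not a Vector{String}. (Assumption: no missing values in the column.)
backing = ["x", "y", "z", "w"]
col = view(backing, 2:4)

# Generic Base constructor; copies the elements into a fresh Vector{String}.
plain = Vector{String}(col)
```

The result is an independent copy, so later mutation of `plain` does not touch the original column.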
@bkamins
Member Author

bkamins commented May 17, 2019

Another problem with WeakRefStrings.StringArray{String,1} is that it does not support deleteat!, which is needed in filter! in the DataFrames.jl package.

@nalimilan
Member

> Another problem with WeakRefStrings.StringArray{String,1} is that it does not support deleteat!, which is needed in filter! in the DataFrames.jl package.

AFAIK this has just been fixed by JuliaData/WeakRefStrings.jl#61.

Regarding the memory issues, maybe StringArray should compress the buffer periodically from setindex! based on some heuristics?

@bkamins
Member Author

bkamins commented May 17, 2019

Excellent!

@bkamins
Member Author

bkamins commented May 18, 2019

@quinnj Here is a list of methods that we should support, if possible, to be consistent with the Vector API:

  • deleteat!
  • insert!
  • splice!
  • prepend!
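
For reference, the behaviour these methods have on a plain Vector{String} can be sketched as follows. This uses only Base and does not touch StringArray; it shows the semantics a custom array type would presumably need to match.

```julia
# Plain Vector{String} used as the reference behaviour; no StringArray involved.
v = ["a", "b", "c", "d"]

deleteat!(v, 2)            # remove the element at index 2
insert!(v, 2, "x")         # insert "x" at index 2, shifting later elements
removed = splice!(v, 3)    # remove and return the element at index 3
prepend!(v, ["p", "q"])    # add elements at the front
```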

@nalimilan
Member

pop! too?

@bkamins
Member Author

bkamins commented May 20, 2019

Right. I missed it because I did an automatic scan of the Julia documentation, and pop! does not have Vector in its signature there.

@quinnj
Member

quinnj commented May 20, 2019

We need to make sure PooledArray/CategoricalArray support all those too, right?

@nalimilan
Member

Yes. See JuliaData/PooledArrays.jl#23, JuliaData/CategoricalArrays.jl#192 and JuliaLang/julia#32065. It seems that deleteat! is the most commonly requested function because DataFrames needs it for filter! and dropmissing!.
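
To illustrate why deleteat! matters for row filtering, here is a minimal sketch of an in-place row drop over several columns. The helper `drop_rows!` is hypothetical and is not the DataFrames.jl implementation; it only shows that every column type has to support deleteat! for this pattern to work.

```julia
# Hypothetical helper, not the DataFrames.jl implementation.
# `columns` is any collection of equal-length vectors; `keep` marks rows to retain.
function drop_rows!(columns, keep::AbstractVector{Bool})
    drop = findall(!, keep)      # sorted indices of the rows to remove
    for col in columns
        deleteat!(col, drop)     # requires deleteat! on every column type
    end
    return columns
end

name_col = ["a", "b", "c", "d"]
age_col = [1, 2, 3, 4]
drop_rows!([name_col, age_col], age_col .> 2)
```

If any column is a custom array type without a deleteat! method, the loop fails partway through, which is the situation this thread describes.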

@quinnj
Member

quinnj commented Jul 10, 2019

I think the main issues here (missing mutation methods for custom arrays) are all resolved now.

@quinnj quinnj closed this as completed Jul 10, 2019