Implement dplyr's relocate #2432

Closed
xiaodaigh opened this issue Sep 14, 2020 · 5 comments

Comments

@xiaodaigh
Contributor

xiaodaigh commented Sep 14, 2020

I am trying to replicate the dplyr "Get started" tutorial.

It uses a relocate function to reorder columns as needed.

If we implemented this in DataFrames.jl, it could look something like this:

relocate(df, :sex=>:homeworld, before = :height)

This would move all columns between :sex and :homeworld (inclusive) to the immediate left of the :height column.

I am not sure whether this belongs in DataFrames.jl itself or in a separate package such as DataFramesUtils.jl, which would keep DataFrames.jl lean and clean.

The dplyr equivalent:

starwars %>% relocate(sex:homeworld, .before = height)
#> # A tibble: 87 x 14
#>   name  sex   gender homeworld height  mass hair_color skin_color eye_color
#>   <chr> <chr> <chr>  <chr>      <int> <dbl> <chr>      <chr>      <chr>    
#> 1 Luke… male  mascu… Tatooine     172    77 blond      fair       blue     
#> 2 C-3PO none  mascu… Tatooine     167    75 <NA>       gold       yellow   
#> 3 R2-D2 none  mascu… Naboo         96    32 <NA>       white, bl… red      
#> 4 Dart… male  mascu… Tatooine     202   136 none       white      yellow   
#> # … with 83 more rows, and 5 more variables: birth_year <dbl>, species <chr>,
#> #   films <list>, vehicles <list>, starships <list>
@bkamins
Member

bkamins commented Sep 14, 2020

Thank you for this comment.

This is intentionally not supported. We provide select and select!, which allow you to do this at a lower level, e.g.:

select(starwars, 1:columnindex(starwars, :height)-1, Between(:sex, :homeworld), :)

or (it is easier to move things AFTER some column than BEFORE it)

select(starwars, Between(1,:name), Between(:sex, :homeworld), :)

Of course this is not the same as relocate: for example, it does not check whether :height lies in the range from :sex to :homeworld.
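
For illustration, here is a minimal sketch of how such a check could look in an external utility. The helper `relocate_order` and its argument names are hypothetical and not part of DataFrames.jl. It uses only Base Julia, works on a plain vector of column names, and returns the names in their new order:

```julia
# Hypothetical helper (not a DataFrames.jl function): compute the column order
# that moving the block `from`..`to` (inclusive) to just before `before` would give.
# Assumes column names are unique, as they are in a data frame.
function relocate_order(cols::AbstractVector{Symbol}, from::Symbol, to::Symbol;
                        before::Symbol)
    i = findfirst(==(from), cols)
    j = findfirst(==(to), cols)
    b = findfirst(==(before), cols)
    if i === nothing || j === nothing || b === nothing
        throw(ArgumentError("from, to and before must all be existing column names"))
    end
    i <= j || throw(ArgumentError("`from` must not come after `to`"))
    # The check that the plain `select` calls skip:
    # the target column must not be part of the block being moved.
    if i <= b <= j
        throw(ArgumentError("`before` lies inside the range being moved"))
    end

    moved = cols[i:j]
    rest = [c for c in cols if !(c in moved)]  # remaining columns, original order
    k = findfirst(==(before), rest)            # position of the target among the rest
    return vcat(rest[1:k-1], moved, rest[k:end])
end
```

Assuming the result is then handed to select, a call might look like select(starwars, relocate_order(propertynames(starwars), :sex, :homeworld; before = :height)).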

I would leave it to external packages to define such utility functions, to keep the DataFrames.jl API minimal. The design principle is that DataFrames.jl should provide building blocks that are efficient and cover the most common use cases. In this case, by "most common use cases" we mean moving some columns to the front or to the back of a data frame, which can be done easily.

Therefore I am closing it, but please comment and I will reopen if you have a strongly different opinion.

@bkamins bkamins closed this as completed Sep 14, 2020
@EricForgy

Have you considered creating an even more lightweight DataFramesCore.jl?

I would think DataFrames.jl would be the right place for these kinds of obvious convenience methods, and if someone didn't want the sugar, they could use a more bare-bones DataFramesCore.jl. It would seem a little silly to create a new separate package for this kind of thing.

I don't feel strongly about it, but it seems like this is a reasonable thing for DataFrames.jl to have.

@bkamins
Member

bkamins commented Sep 14, 2020

see #1764

@nalimilan
Member

Also, we're currently trying to stabilize DataFrames 1.0, so we don't really have the bandwidth to design, develop, and review utility methods at the moment. Designing a good Julian API takes a lot of work, so we'd rather defer this until we've completed the core.

@EricForgy

Makes sense. Thank you both 👍
