Separate `transmute()` from `mutate(.keep = "none")` #6086

DavisVaughan · 2021-11-16T18:15:58Z

In #6035 we reworked .keep to be more consistent for mutate(). I still stand by this decision in the context of mutate(), but it has changed the behavior of transmute(), since transmute() is currently implemented as mutate(.keep = "none"). Here is an example that shows the change in behavior for both grouping variables and modified non-grouping variables.

library(dplyr)

df <- tibble(
  g1 = 1:3,
  g2 = 1:3,
  x = 1:3,
  y = 4:6
)

gdf <- group_by(df, g1, g2)

transmute(gdf, x = x + 1, z = x + 1, y = y + 1, g1 = g1 + 1)

# CRAN:
# - The column order supplied in `...` is kept.
# - Grouping vars not modified by `...` are kept at the front.

#> # A tibble: 3 × 5
#> # Groups:   g1, g2 [3]
#>      g2     x     z     y    g1
#>   <int> <dbl> <dbl> <dbl> <dbl>
#> 1     1     2     3     5     2
#> 2     2     3     4     6     3
#> 3     3     4     5     7     4

# Current dev:
# The column ordering comes from:
# - Modified columns are altered in place
# - New columns are added at the end

#> # A tibble: 3 × 5
#> # Groups:   g1, g2 [3]
#>      g1    g2     x     y     z
#>   <dbl> <int> <dbl> <dbl> <dbl>
#> 1     2     1     2     5     3
#> 2     3     2     3     6     4
#> 3     4     3     4     7     5

It turns out that many people want transmute() to keep the current CRAN behavior because they use it for mixed selection and mutation and expect a specific column ordering from it.
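To make the "mixed selection and mutation" use case concrete, here is a minimal sketch. The data frame and the column names (`id`, `x`, `y`, `x_pct`) are made up for illustration and do not come from any report in this thread.

```r
library(dplyr)

# Illustrative data; the column names are assumptions, not from the issue.
df <- tibble(id = 1:3, x = c(0.1, 0.2, 0.3), y = c("a", "b", "c"))

# Bare column names act like a selection, named expressions compute new columns.
# The output follows the order of the arguments, not the column order of `df`.
out <- transmute(df, y, x_pct = x * 100, id)

names(out)
#> [1] "y"     "x_pct" "id"
```

Code written this way relies on the argument order to lay out the result, which is why the column reordering in the dev version is disruptive.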

We previously believed that transmute() had swapped between these two outputs over various dplyr releases, but this was actually not true, as seen in #6080 (comment). The current CRAN behavior has always been how transmute() works. In light of this, we believe we should retain the CRAN behavior of transmute() for the next dplyr release.

That said, mutate(.keep = "none") should retain its current dev behavior. This is an experimental argument, so changing it should not affect too many users. The dev behavior of .keep = "none" is more consistent with the other mutate() options, makes the output easier to predict when combined with .before and .after, and simplifies the implementation: .keep never affects the column ordering and is mainly about which columns get dropped (#6035 goes into this in great detail).
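The difference between the two orderings can be seen on a small ungrouped example. This sketch assumes a dplyr version in which the split proposed here has landed (1.0.8 or later, going by the linked downstream issues); on older CRAN releases both calls are expected to return the same column order.

```r
library(dplyr)

df <- tibble(x = 1:3, y = 4:6)

# transmute(): column order follows the expressions in `...`.
by_transmute <- transmute(df, z = x + 1, y = y * 2)

# mutate(.keep = "none"): modified columns stay in place, new columns go last.
by_keep_none <- mutate(df, z = x + 1, y = y * 2, .keep = "none")

names(by_transmute)
#> [1] "z" "y"
names(by_keep_none)
#> [1] "y" "z"
```

Both results hold the same columns and values; only the ordering rule differs.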

So, the action items are:

- Fix transmute() to revert to the CRAN behavior, which requires giving it its own implementation separate from mutate()
- Update the NEWS bullet to only mention the change in .keep
- Separate any comparison of transmute() and .keep = "none" in the documentation, making it clear how those are different
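As a rough illustration of the first action item, the CRAN ordering rule can be approximated with public dplyr verbs. `transmute_sketch()` is a hypothetical helper written for this example only; it is not the implementation used in dplyr, and it ignores edge cases such as expressions that evaluate to `NULL` or unnamed expressions that return data frames.

```r
library(dplyr)

# Hypothetical helper, NOT dplyr's actual transmute() implementation.
transmute_sketch <- function(.data, ...) {
  # Names of the expressions, in the order they were supplied.
  expr_names <- unique(names(enquos(..., .named = TRUE)))

  # Grouping variables that are not recomputed are kept at the front.
  untouched_groups <- setdiff(group_vars(.data), expr_names)

  # Compute everything with mutate(), then impose the ordering explicitly.
  out <- mutate(.data, ...)
  select(out, all_of(c(untouched_groups, expr_names)))
}

df <- tibble(g1 = 1:3, g2 = 1:3, x = 1:3, y = 4:6)
gdf <- group_by(df, g1, g2)

res <- transmute_sketch(gdf, x = x + 1, z = x + 1, y = y + 1, g1 = g1 + 1)
names(res)
#> [1] "g2" "x"  "z"  "y"  "g1"
```

The resulting order matches the CRAN output shown at the top of this issue: the untouched grouping variable `g2` first, then the columns in the order they appear in `...`.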


…option is set

This PR does two things to match some dplyr behaviour around column order:

1) Mimics the dplyr implementation of `mutate(..., .keep = "none")` to append new columns after the existing columns (if suggested), as [per](tidyverse/dplyr#6086).
2) As per this [discussion](tidyverse/dplyr#6086), this required a bespoke approach to `transmute`, as it is not simply a wrapper for `mutate(..., .keep = "none")`. This cascades into needing to catch a couple of edge cases.

I have also added some tests which will test for this behaviour.

Closes #12818 from boshek/mutate-keep

Authored-by: SAm Albers <[email protected]>
Signed-off-by: Jonathan Keane <[email protected]>


DavisVaughan mentioned this issue Nov 16, 2021

Separate transmute() and mutate(.keep = "none") #6087

Merged

hughjonesd mentioned this issue Nov 16, 2021

dplyr 1.0.8 hughjonesd/huxtable#215

Closed

DavisVaughan closed this as completed in #6087 Nov 17, 2021

eitsupi mentioned this issue Dec 26, 2021

update test for dplyr 1.0.8 privefl/bigsnpr#274

Closed

boshek mentioned this issue Apr 6, 2022

ARROW-16038: [R] different behavior from dplyr when mutate's .keep option is set apache/arrow#12818

Closed

pwwang added a commit to pwwang/datar-pandas that referenced this issue Apr 13, 2023

👽️ [dplyr] Separate transmute() and mutate(_keep="none") (tidyverse/d…

b108cbc

…plyr#6086)

DavisVaughan mentioned this issue Jul 17, 2023

mutate superseding transmute should allow ordering columns #6861

Open
