Fwrite integer rownames #5098
Include row.names from R
…e number at the end of the first sentence)
R/fwrite.R
@@ -36,6 +36,7 @@ fwrite = function(x, file="", append=FALSE, quote="auto",
nThread = as.integer(nThread)
# write.csv default is 'double' so fwrite follows suit. write.table's default is 'escape'
# validate arguments
rn = if (row.names) row.names(x) else NULL # allocate row.names in R to address integer row.names #4957
I like your reasoning for this, i.e. option 1 as you detailed in the issue. However, I'm a bit concerned about any inadvertent usage that ends up calling row.names=TRUE without the user intending to. Then 1:nrow will be coerced to character by this line and the global character cache gets clobbered. I would prefer not to leave that door open. Another open door would be benchmarkers who, either deliberately or by accident, display results with row.names=TRUE and show poor performance.
Hence I went for option 3 as you described.
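To make the coercion concern concrete, here is a minimal base R sketch. It does not use data.table, and the names df_auto and df_int are made up for illustration and do not come from the PR. It only shows that row.names() returns a character vector even when the row names are stored as integers, so calling it materialises 1:nrow as strings.
```r
# Base R only. `df_auto` and `df_int` are hypothetical names for illustration.
df_auto <- data.frame(a = 1:3)          # default ("automatic") row names
df_int  <- data.frame(a = 1:3)
row.names(df_int) <- c(10L, 20L, 30L)   # integer row names assigned by the user

# Both kinds of row names are stored as integers in the "row.names" attribute.
storage_auto <- typeof(attr(df_auto, "row.names"))
storage_int  <- typeof(attr(df_int, "row.names"))

# row.names() always returns character, so the integers are turned into strings.
rn_auto <- row.names(df_auto)
rn_int  <- row.names(df_int)

# .row_names_info() is negative for automatic row names, so the two cases
# can be told apart without coercing anything to character.
info_auto <- .row_names_info(df_auto)
info_int  <- .row_names_info(df_int)

print(list(storage_auto, storage_int, rn_auto, rn_int, info_auto, info_int))
```
For a data frame with many rows, the as.character step behind row.names() creates one string per row, which is the cost the comment above is worried about when row.names=TRUE is reached unintentionally.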
Brilliant! Thanks for your time - changes look great!
Closes #4957.
The biggest change is the behavior of quote = 'auto' for default row names and for integer-assigned row names, because this PR makes all row names character. I believe this better matches the existing behavior of quote = 'auto', as the current behavior does not add double quotes to character row.names.
Finally, based on this PR, the part of fwrite.c shown below is no longer used. In the issue, I commented on how to do this more directly in C. I would be happy to move forward with a C approach, but it seems like a larger diff for little gain. Some big-data users who need performance may use row names, but overall I doubt it.
data.table/src/fwrite.c
Lines 876 to 881 in 831013a