From c3d1100cf814ee16d003f8ab294dcf8e93d75d16 Mon Sep 17 00:00:00 2001
From: Cole Miller <57992489+ColeMiller1@users.noreply.github.com>
Date: Tue, 10 Aug 2021 03:55:07 -0400
Subject: [PATCH] Fwrite integer rownames (#5098)

---
 NEWS.md               |  2 ++
 inst/tests/tests.Rraw | 35 +++++++++++++++++++++--------------
 src/fwrite.c          | 20 ++++++++++++--------
 src/fwrite.h          |  3 ++-
 src/fwriteR.c         | 14 +++++++++++++-
 5 files changed, 50 insertions(+), 24 deletions(-)

diff --git a/NEWS.md b/NEWS.md
index 3a9ffbca5..8bb857cef 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -163,6 +163,8 @@
 
 30. `fread(file=URL)` now works rather than error `does not exist or is non-readable`, [#4952](https://github.com/Rdatatable/data.table/issues/4952). `fread(URL)` and `fread(input=URL)` worked before and continue to work. Thanks to @pnacht for reporting and @ben-schwen for the PR.
 
+31. `fwrite(DF, row.names=TRUE)` where `DF` has specific integer rownames (e.g. using `rownames(DF) <- c(10L,20L,30L)`) would ignore the integer rownames and write the row numbers instead, [#4957](https://github.com/Rdatatable/data.table/issues/4957). Thanks to @dgarrimar for reporting and @ColeMiller1 for the PR. Further, when `quote='auto'` (default) and the rownames are integers (either default or specific), they are no longer quoted.
+
 ## NOTES
 
 1. New feature 29 in v1.12.4 (Oct 2019) introduced zero-copy coercion. Our thinking is that requiring you to get the type right in the case of `0` (type double) vs `0L` (type integer) is too inconvenient for you the user. So such coercions happen in `data.table` automatically without warning. Thanks to zero-copy coercion there is no speed penalty, even when calling `set()` many times in a loop, so there's no speed penalty to warn you about either. However, we believe that assigning a character value such as `"2"` into an integer column is more likely to be a user mistake that you would like to be warned about. The type difference (character vs integer) may be the only clue that you have selected the wrong column, or typed the wrong variable to be assigned to that column. For this reason we view character to numeric-like coercion differently and will warn about it. If it is correct, then the warning is intended to nudge you to wrap the RHS with `as.<type>()` so that it is clear to readers of your code that a coercion from character to that type is intended. For example :
diff --git a/inst/tests/tests.Rraw b/inst/tests/tests.Rraw
index ea8abd753..0ccda6f46 100644
--- a/inst/tests/tests.Rraw
+++ b/inst/tests/tests.Rraw
@@ -10706,23 +10706,30 @@ test(1733.2, fwrite(data.table(c(1.2,-8.0,pi,67.99),1:4),dec=",",sep=";"),
 
 # fwrite implied and actual row.names
 DT = data.table(foo=1:3,bar=c(1.2,9.8,-6.0))
-test(1734.1, capture.output(fwrite(DT,row.names=TRUE,quote=FALSE)),
-             capture.output(write.csv(DT,quote=FALSE)))
-test(1734.2, capture.output(fwrite(DT,row.names=TRUE,quote=TRUE)),
-             capture.output(write.csv(DT)))
-test(1734.3, fwrite(DT,row.names=TRUE,quote='auto'), # same other than 'foo' and 'bar' column names not quoted
-             output="\"\",foo,bar\n\"1\",1,1.2\n\"2\",2,9.8\n\"3\",3,-6")
+test(1734.01, capture.output(fwrite(DT,row.names=TRUE,quote=FALSE)),
+              capture.output(write.csv(DT,quote=FALSE)))
+test(1734.02, capture.output(fwrite(DT,row.names=TRUE,quote=TRUE)),
+              capture.output(write.csv(DT)))
+test(1734.03, fwrite(DT,row.names=TRUE,quote='auto'), # same other than 'foo' and 'bar' column names not quoted
+              output="\"\",foo,bar\n1,1,1.2\n2,2,9.8\n3,3,-6")
 DF = as.data.frame(DT)
-test(1734.4, capture.output(fwrite(DF,row.names=TRUE,quote=FALSE)),
-             capture.output(write.csv(DF,quote=FALSE)))
-test(1734.5, capture.output(fwrite(DF,row.names=TRUE,quote=TRUE)),
-             capture.output(write.csv(DF)))
+test(1734.04, capture.output(fwrite(DF,row.names=TRUE,quote=FALSE)),
+              capture.output(write.csv(DF,quote=FALSE)))
+test(1734.05, capture.output(fwrite(DF,row.names=TRUE,quote=TRUE)),
+              capture.output(write.csv(DF)))
 rownames(DF)[2] = "someName"
 rownames(DF)[3] = "another"
-test(1734.6, capture.output(fwrite(DF,row.names=TRUE,quote=FALSE)),
-             capture.output(write.csv(DF,quote=FALSE)))
-test(1734.7, capture.output(fwrite(DF,row.names=TRUE,quote=TRUE)),
-             capture.output(write.csv(DF)))
+test(1734.06, capture.output(fwrite(DF,row.names=TRUE,quote=FALSE)),
+              capture.output(write.csv(DF,quote=FALSE)))
+test(1734.07, capture.output(fwrite(DF,row.names=TRUE,quote=TRUE)),
+              capture.output(write.csv(DF)))
+rownames(DF) = c(10L, -20L, 30L) ## test for #4957
+test(1734.08, capture.output(fwrite(DF, row.names=TRUE, quote=TRUE)),
+              capture.output(write.csv(DF)))
+test(1734.09, capture.output(fwrite(DF, row.names=TRUE, quote=FALSE)),
+              capture.output(write.csv(DF, quote=FALSE)))
+test(1734.10, fwrite(DF, row.names=TRUE, quote='auto'),
+              output=c('"",foo,bar','10,1,1.2','-20,2,9.8','30,3,-6'))
 
 # list columns and sep2
 set.seed(1)
diff --git a/src/fwrite.c b/src/fwrite.c
index f7f400318..2d10d222f 100644
--- a/src/fwrite.c
+++ b/src/fwrite.c
@@ -623,8 +623,8 @@ void fwriteMain(fwriteMainArgs args)
         DTPRINT(_("... "));
         for (int j=args.ncol-10; j