Do not allow `NA` as column name #316

jmbarbone · 2022-01-11T15:28:20Z

Resolves #292

@JanMarvin do you know how we can tell if the NA is converted to "NA" in the sheet data?

The change is in writeData so it should have an effect in buildWorkbook(). Just couldn't get my head around what writeData() and Workbook$writeData() were doing.

codecov-commenter · 2022-01-11T15:34:16Z

Codecov Report

Merging #316 (19f97da) into master (f4bf841) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #316   +/-   ##
=======================================
  Coverage   67.70%   67.71%           
=======================================
  Files          34       34           
  Lines        8897     8898    +1     
=======================================
+ Hits         6024     6025    +1     
  Misses       2873     2873

Impacted Files	Coverage Δ
R/writeData.R	`88.39% <100.00%> (+0.06%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5e44c5e...19f97da. Read the comment docs.

JanMarvin · 2022-01-11T18:38:22Z

You just want to know, if the string is imported as "NA"? If written, is is converted into a sharedString (<sst><si /></sst> in openxml). Therefore you could check if wb$sharedString contains "<si><t xml:space=\"preserve\">NA</t></si>".

The position in the shared string is a 0 index, matching the value of v + 1 in wb$worksheets[[1]]$sheet_data$v where the type is 1 in wb$worksheets[[1]]$sheet_data$t.

I know that feeling in writeData and friends. The construction of the shared string and the lists are deeply embedded in openxlsx and make it near impossible to really implement things like inlineStr.

JanMarvin · 2022-01-11T18:41:10Z

You could look into make.names() to create unique names, it is used in the package at various places, otherwise you might end up with multiple "NA"s which brings other problems.

jmbarbone · 2022-01-11T19:09:31Z

You could look into make.names() to create unique names, it is used in the package at various places, otherwise you might end up with multiple "NA"s which brings other problems.

Considered that. Not sure if it's that's to best. There's no requirement for data in sheets to be unique, so the "NA" seemed like a less invasive fix. make.names() is used for the reading, which makes it difficult to test this (thought I could make the workbook and then read it to make sure it was okay). Technically you can still have a data.frame() in R without unique names, it's just a bad idea:

x <- data.frame(x = 1)
x <- cbind(x, x)
x
#>   x x
#> 1 1 1
str(x)
#> 'data.frame':    1 obs. of  2 variables:
#>  $ x: num 1
#>  $ x: num 1

^{Created on 2022-01-11 by the reprex package (v2.0.1)}

I've seen plenty of workbooks with repeated column names. Maybe something a little nicer?

x <- c(NA, "a", "b", NA, "b", "a", NA)

make.names(x)
#> [1] "NA." "a"   "b"   "NA." "b"   "a"   "NA."
make.names(x, unique = TRUE)
#> [1] "NA."   "a"     "b"     "NA..1" "b.1"   "a.1"   "NA..2"

clean_na_names <- function(x) {
  w <- which(is.na(x))
  for (i in w) {
    x[w] <- paste0("NA..", w)
  }
  x
}

clean_na_names(x)
#> [1] "NA..1" "a"     "b"     "NA..4" "b"     "a"     "NA..7"

^{Created on 2022-01-11 by the reprex package (v2.0.1)}

What I'm worried about is disharmony between reading and writing though. Maybe warnings/controls for reading a workbook as a data.frame with duplicated column names?

github-actions · 2023-01-14T02:15:30Z

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

jmbarbone added 2 commits January 11, 2022 10:23

adds start to NA colnames

307695f

Update NEWS.md

19f97da

github-actions bot added the Stale label Jan 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Do not allow `NA` as column name #316

Do not allow `NA` as column name #316

jmbarbone commented Jan 11, 2022 •

edited

Loading

codecov-commenter commented Jan 11, 2022 •

edited

Loading

JanMarvin commented Jan 11, 2022 •

edited

Loading

JanMarvin commented Jan 11, 2022

jmbarbone commented Jan 11, 2022

github-actions bot commented Jan 14, 2023

Do not allow NA as column name #316

Are you sure you want to change the base?

Do not allow NA as column name #316

Conversation

jmbarbone commented Jan 11, 2022 • edited Loading

codecov-commenter commented Jan 11, 2022 • edited Loading

Codecov Report

JanMarvin commented Jan 11, 2022 • edited Loading

JanMarvin commented Jan 11, 2022

jmbarbone commented Jan 11, 2022

github-actions bot commented Jan 14, 2023

Do not allow `NA` as column name #316

Do not allow `NA` as column name #316

jmbarbone commented Jan 11, 2022 •

edited

Loading

codecov-commenter commented Jan 11, 2022 •

edited

Loading

JanMarvin commented Jan 11, 2022 •

edited

Loading