Feature request: Column-level na values via collectors #532

Open
khusmann opened this issue Mar 25, 2024 · 0 comments
Right now, NA values are specified globally via the na argument of the read_*() family of functions. Sometimes I want to supply NA values for specific columns rather than for the entire data set. A nice way to do this could be to add an na argument to each of the collector types (col_factor(), col_double(), and so on) to specify column-level missing values.

Column-level missing values come up frequently in survey data. Here are two examples:

Example 1:

What is your current stress level?
a. Low (LOW)
b. Moderate (MODERATE)
c. High (HIGH)
d. I don’t know (DONT_KNOW)
e. I don’t understand the question (DONT_UNDERSTAND)

I'd like to be able to create a col_factor collector that reads the last two responses as NA, as follows:

col_factor(levels = c("LOW", "MODERATE", "HIGH"), ordered = TRUE, na = c("DONT_KNOW", "DONT_UNDERSTAND"))

Example 2:

An item that records the individual's height as a double, but where the column can also contain the missing-value codes "ABSENT" and "RULER_BROKE":

col_double(na = c("ABSENT", "RULER_BROKE"))
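
For comparison, here is a rough sketch of how the same result can be approximated today without the proposed na argument on collectors: read every column as character, then convert each column with the matching parse_*() function, which already accepts its own na argument. The data, column names, and the survey/raw variable names are made up for illustration, and I() for literal data assumes readr 2.0 or later.

```r
library(readr)

# Hypothetical survey data; the column names and values are invented for this sketch.
csv <- I("stress,height\nLOW,170.5\nDONT_KNOW,ABSENT\nHIGH,RULER_BROKE\nMODERATE,182\n")

# Step 1: read everything as character so the column-specific codes survive the read.
raw <- read_csv(csv, col_types = cols(.default = col_character()))

# Step 2: convert each column with its own set of missing-value codes.
survey <- raw
survey$stress <- parse_factor(
  raw$stress,
  levels = c("LOW", "MODERATE", "HIGH"),
  ordered = TRUE,
  na = c("DONT_KNOW", "DONT_UNDERSTAND"),
  include_na = FALSE  # keep NA as a missing value rather than as a factor level
)
survey$height <- parse_double(raw$height, na = c("ABSENT", "RULER_BROKE"))
```

This works, but it needs two passes over each column and moves the column specification out of col_types, which is the inconvenience that a collector-level na argument would remove.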