df = data.frame("a" = c(1, 2, 3, "q", "r"),
                "b" = c(5,6,7,0,"s"))
dfWANT = data.frame("a" = c(1, 2, 3, "NA", "NA"),
                    "b" = c(5,6,7,0,"NA"))
REP = c("q", "r", "s")

df[,][df[,] == REP] <- NA

I want to specify a vector, REP, holding the values that should be set to NA. The original data is df and the result I want is dfWANT. The last line is my attempt, which only works on column a.

NelsonGon
bvowe

2 Answers


You could use sapply to build a logical matrix that is TRUE wherever a value in df occurs in REP, and then replace the values at those TRUE positions with NA.

df[sapply(df, `%in%`, REP)] <- NA

#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
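
As a rough sketch of why the `==` attempt in the question only catches column a: `==` compares position by position and recycles REP down the columns, whereas `%in%` tests each value for membership in REP. The names eq_mask and in_mask are made up for illustration, and stringsAsFactors = FALSE is an assumption so that the columns are character on any R version.

```r
# Same data as in the question; stringsAsFactors = FALSE is assumed here
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"),
                 stringsAsFactors = FALSE)
REP <- c("q", "r", "s")

# `==` recycles REP over the cells column by column (q, r, s, q, r, s, ...),
# so a cell is TRUE only if it happens to line up with the same value
eq_mask <- df == REP

# `%in%` asks, for every cell, whether the value occurs anywhere in REP
in_mask <- sapply(df, `%in%`, REP)

sum(eq_mask)  # 2: only "q" and "r" in column a happen to line up
sum(in_mask)  # 3: "q", "r" and "s" are all found
```

Here the recycled REP lines up with "q" and "r" in column a by coincidence, which is why the original attempt appeared to work there.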

With dplyr, we can use mutate_all (superseded by mutate(across(...)) as of dplyr 1.0.0):

library(dplyr)
df %>% mutate_all(~replace(., . %in% REP, NA))
Ronak Shah

We can convert the data.frame to a matrix and apply %in% to it directly, without looping, in base R:

df[`dim<-`(as.matrix(df) %in% REP, dim(df))] <- NA
df
#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
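
The one-liner above can be unpacked into separate steps. This is only a sketch: flat and mask are illustrative names, matrix() is used in place of `dim<-` (both fill column by column), and stringsAsFactors = FALSE is an assumption so the columns are character on any R version.

```r
# Same data as in the question; stringsAsFactors = FALSE is assumed here
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"),
                 stringsAsFactors = FALSE)
REP <- c("q", "r", "s")

# %in% drops the matrix dimensions and returns a plain logical vector
flat <- as.matrix(df) %in% REP

# Restore the shape of df; matrix() fills column by column, like `dim<-`
mask <- matrix(flat, nrow = nrow(df), ncol = ncol(df))

# Index the data.frame with the logical matrix and assign NA
df[mask] <- NA
df
```

Restoring the dimensions is the step that matters: the data.frame is indexed here by a logical matrix of the same shape as df, which a bare vector does not provide.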

Or use data.table, whose set() updates the columns by reference:

library(data.table)
setDT(df)
for(j in seq_along(df)) set(df, i = which(df[[j]] %in% REP),  j=j, value = NA_character_)
akrun