Replace custom values with NA

Question

df = data.frame("a" = c(1, 2, 3, "q", "r"),
                "b" = c(5,6,7,0,"s"))
dfWANT = data.frame("a" = c(1, 2, 3, "NA", "NA"),
                    "b" = c(5,6,7,0,"NA"))
REP = c("q", "r", "s")

df[,][df[,] == REP] <- NA

I want to specify a vector (REP) of values that should be set to NA. The original data is df and the desired result is dfWANT. The last line is my attempt, but it only works on column a.

Or: `df = as.data.frame(lapply(df, function(x) ifelse(x%in%REP, NA,x)), stringsAsFactors = F)` — R. Schifini, Feb 08 '20 at 02:57
@R.Schifini Thanks so much but I do not wish to get rid of all characteristics/strings just the ones I specify in REP. — bvowe, Feb 08 '20 at 02:59
Not positive, but seems like this question should be mostly covered by [this one](https://stackoverflow.com/q/24172111/5325862) — camille, Feb 08 '20 at 19:06

Ronak Shah · Accepted Answer · 2020-02-08T03:23:19.100

3

You can use sapply to build a logical matrix that is TRUE wherever a cell's value is in REP, and then assign NA to those cells.

df[sapply(df, `%in%`, REP)] <- NA

#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
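
A minimal sketch of what that line does, split into steps. It assumes `df` is built with `stringsAsFactors = FALSE`, so the columns are character vectors on any R version; `eq` and `hit` are names introduced here for illustration. It also suggests why the `==` attempt in the question falls short: `==` compares position by position and recycles `REP`, whereas `%in%` tests each cell for membership in `REP`.

```r
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"),
                 stringsAsFactors = FALSE)
REP <- c("q", "r", "s")

# `==` lines REP up against the cells position by position (recycling REP),
# so a cell only matches if the recycled value happens to land on it.
eq <- df == REP

# `%in%` asks, for every cell, "is this value anywhere in REP?".
# sapply applies it column by column and binds the results into a
# logical matrix with the same shape as df.
hit <- sapply(df, `%in%`, REP)

# Indexing a data frame with a logical matrix assigns to the TRUE cells.
df[hit] <- NA
```

Printing `eq` and `hit` side by side before the assignment is a quick way to check which cells each approach would select.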

In dplyr, we can use mutate_all

library(dplyr)
df %>% mutate_all(~replace(., . %in% REP, NA))
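
`mutate_all()` has been superseded by `across()` since dplyr 1.0.0. A sketch of the same replacement in that style, assuming dplyr 1.0.0 or later is installed; `out` is a name introduced here for illustration, and `df` is rebuilt with `stringsAsFactors = FALSE` so the columns are character vectors:

```r
library(dplyr)

df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"),
                 stringsAsFactors = FALSE)
REP <- c("q", "r", "s")

# For every column, replace the entries found in REP with NA.
# The result is returned as a new data frame; df itself is unchanged.
out <- df %>%
  mutate(across(everything(), ~ replace(.x, .x %in% REP, NA)))
```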

edited Feb 08 '20 at 03:23

answered Feb 08 '20 at 03:05


akrun · Answer 2 · 2020-02-08T18:19:55.453

1

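We can convert the data.frame to a matrix and apply %in% once in base R, without looping over the columns. Because %in% returns a plain vector, `dim<-` is used to restore the dimensions of the data.frame.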

df[`dim<-`(as.matrix(df) %in% REP, dim(df))] <- NA
df
#     a    b
#1    1    5
#2    2    6
#3    3    7
#4 <NA>    0
#5 <NA> <NA>
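
The one-liner above can be unrolled as follows. This is only a sketch under the assumption that the columns are character vectors (hence `stringsAsFactors = FALSE`); `m`, `flat` and `hit` are names introduced here for illustration.

```r
df <- data.frame(a = c(1, 2, 3, "q", "r"),
                 b = c(5, 6, 7, 0, "s"),
                 stringsAsFactors = FALSE)
REP <- c("q", "r", "s")

# A character matrix with the same shape as df.
m <- as.matrix(df)

# %in% drops the dimensions: the result is a plain logical vector,
# one element per cell, in column-major order.
flat <- m %in% REP

# Give the vector the shape of df again so it can index cells.
hit <- flat
dim(hit) <- dim(df)

# Assign NA to every cell flagged TRUE.
df[hit] <- NA
```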

Or use data.table, where set() updates the columns by reference

library(data.table)
setDT(df)
for (j in seq_along(df)) {
  set(df, i = which(df[[j]] %in% REP), j = j, value = NA_character_)
}

edited Feb 08 '20 at 18:19

answered Feb 08 '20 at 18:14

