I have a dataset with multiple columns. In some columns all values are missing, in other columns only a few values. When importing to R, the missing values are only shown as empty cells and not as NA. This results in the problem that when I want to remove NA values with the na.omit() function nothing happens. How should I handle this problem?

Gabesz

2 Answers

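You could replace all the empty values with NA by using the following code:

library(dplyr)

# Replace empty strings with NA in every character column.
# With dplyr >= 1.1.0, na_if(x, "") raises an error on non-character columns,
# so the older mutate_all(na_if, "") only works when all columns are character.
your_data_with_NA <- your_data %>% 
  mutate(across(where(is.character), ~ na_if(.x, "")))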
Quinten

The easiest approach is often to fix this when importing your data. Most functions for importing data have an argument that specifies which values should be interpreted as NA, for example:

  • read.csv and the like: na.strings
  • readr::read_csv and the like: na
  • readxl::read_excel and the like: na

Set this argument to "", " ", or whatever other value marks a missing entry in your file but is not treated as NA by default.
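As a minimal sketch of the import-time fix, here is base R's read.csv reading a small inline CSV instead of a real file. The csv_text contents and the object names df_default and df_na are made up for illustration; only the na.strings argument comes from the list above.

```r
# Hypothetical CSV content standing in for a real file.
# Row 2 has an empty name, row 3 has an empty score.
csv_text <- "id,name,score
1,Anna,10
2,,20
3,Carl,"

# Default import: the empty character field stays "" and is not NA.
df_default <- read.csv(text = csv_text)
sum(is.na(df_default$name))  # 0

# Treat empty strings (and the literal text NA) as missing on import.
df_na <- read.csv(text = csv_text, na.strings = c("", "NA"))
sum(is.na(df_na$name))       # 1

# na.omit() now removes the rows with missing values.
nrow(na.omit(df_na))         # 1
```

Keeping "NA" in na.strings preserves the default behaviour for cells that already contain the text NA.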

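If this is not an option for you, you can replace the blank values with NA using:

library(dplyr)

# Replace empty strings with NA in character columns only.
# With dplyr >= 1.1.0, na_if(.x, "") raises an error on non-character columns.
df %>% 
  mutate(
    across(
      where(is.character),
      ~na_if(.x, "")
    )
  )

Note that mutate(across(...)) has superseded mutate_all(); across(everything(), ...) is the direct equivalent of mutate_all() and applies the function to every column.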


Data

df <- mtcars %>% head()

df[1, 1] <- ""
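
To check the whole workflow on this example data, the following sketch rebuilds df, replaces the blank value, and then calls na.omit(). Restricting the replacement to character columns with where(is.character) is an assumption made here so that the code also runs on recent dplyr versions; df_na and df_clean are illustrative names.

```r
library(dplyr)

# Same example data as above: assigning "" turns mpg into a character column.
df <- mtcars %>% head()
df[1, 1] <- ""

# Replace empty strings with NA in character columns only.
df_na <- df %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

# na.omit() now drops the row that contained the blank value.
df_clean <- na.omit(df_na)

nrow(df)        # 6
nrow(df_clean)  # 5
```

Note that mpg is still a character column after the replacement, so it would need converting (for example with as.numeric()) before doing arithmetic on it.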
jpiversen