
I have an R data frame df, shown below:

a   b   c

1   6  NA
2  NA  4
3   7  NA
NA  8  1
4   9  10
NA  NA  7
5   10  8

I want to remove the rows that have NA in BOTH a and b.

My desired output will be

a   b  c

1   6  NA
2  NA  4
3   7  NA
NA  8  1
4   9  10
5  10  8

I tried something like this:

df1<-df[(is.na(df$a)==FALSE & is.na(df$b)==FALSE),]

but this removes every row that has an NA in either a or b (in effect an OR over the NA checks). I need an AND operation here.

How do I do it?

GaRaGe
6 Answers


You can try:

df1<-df[!(is.na(df$a) & is.na(df$b)), ]
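A minimal sketch of why this works, assuming the question's df with columns a, b and c (rebuilt here from the table in the question). The original attempt keeps only rows where neither column is NA, whereas negating the combined test keeps every row where at least one of the two is present. By De Morgan's law, !(x & y) is the same as !x | !y. The names both_na and df2 are introduced here for illustration only.

```r
# Rebuild the question's data frame (values copied from the table in the question).
df <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10),
  c = c(NA, 4, NA, 1, 10, 7, 8)
)

# TRUE only where a AND b are both missing.
both_na <- is.na(df$a) & is.na(df$b)

# Drop the rows where both are NA ...
df1 <- df[!both_na, ]

# ... which, by De Morgan's law, equals keeping rows where a OR b is present.
df2 <- df[!is.na(df$a) | !is.na(df$b), ]

identical(df1, df2)
print(df1)
```

Only the row with NA in both a and b is dropped; rows with a single NA in a, b or c are kept.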
Kumar Manglam

Using rowSums:

df[!rowSums(is.na(df))==2,]

A slightly shorter version that saves one character [1]:

df[rowSums(is.na(df))!=2,]

Output:

   a  b
1  1  6
2  2 NA
3  3  7
4 NA  8
5  4  9
7  5 10

This can be generalized using ncol:

df[!rowSums(is.na(df))==ncol(df),]

[1] Credit: alistaire
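One caveat, sketched under the assumption that df still contains the question's third column c: rowSums(is.na(df)) counts NAs across every column. Comparing against ncol(df) then only drops rows that are entirely NA, and comparing against 2 drops any row with exactly two NAs, wherever they occur. Restricting the test to the columns of interest avoids both problems. The names cols, keep, res and unrestricted are hypothetical, introduced for illustration.

```r
# The question's three-column data frame (values copied from the question).
df <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10),
  c = c(NA, 4, NA, 1, 10, 7, 8)
)

cols <- c("a", "b")  # hypothetical name: the columns that must not all be NA

# Count NAs only in the selected columns; keep rows where not all of them are NA.
keep <- rowSums(is.na(df[cols])) < length(cols)
res <- df[keep, ]

# Without the restriction, the row (NA, NA, 7) survives because c is not NA.
unrestricted <- df[rowSums(is.na(df)) < ncol(df), ]

print(res)
nrow(unrestricted)
```

Changing cols is then enough to apply the same filter to any other set of columns.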

Prradep

We can use rowSums on a logical matrix (is.na(df1)) and convert that to a logical vector (rowSums(...) < ncol(df1)) to subset the rows.

df1[rowSums(is.na(df1)) < ncol(df1),]

Another option is Reduce with lapply:

df1[!Reduce(`&`, lapply(df1, is.na)),]
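To illustrate what Reduce is doing here, a short sketch using the two-column data from this thread (named df1 to match the code above). lapply produces one logical vector per column, and Reduce folds them together with `&`, so the result is TRUE only where every column is NA. The names na_by_col and all_na are introduced for illustration.

```r
# Two-column example data, matching the a and b values used in this thread.
df1 <- data.frame(
  a = c(1, 2, 3, NA, 4, NA, 5),
  b = c(6, NA, 7, 8, 9, NA, 10)
)

na_by_col <- lapply(df1, is.na)   # list of logical vectors, one per column
all_na <- Reduce(`&`, na_by_col)  # element-wise AND across all columns

# For two columns this is the same as writing the AND out by hand.
identical(all_na, is.na(df1$a) & is.na(df1$b))

df1[!all_na, ]
```

The same call works unchanged for any number of columns, which is the main appeal over spelling out each is.na() term.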
akrun

Another approach:

df[!apply(is.na(df),1,all),]
#   a  b
#1  1  6
#2  2 NA
#3  3  7
#4 NA  8
#5  4  9
#7  5 10

Data

df <- structure(list(a = c(1L, 2L, 3L, NA, 4L, NA, 5L), b = c(6L, NA, 
7L, 8L, 9L, NA, 10L)), .Names = c("a", "b"), class = "data.frame", row.names = c(NA, 
-7L))
user2100721

This will also work:

df[apply(df, 1, function(x) sum(is.na(x)) != ncol(df)),]

   a  b
1  1  6
2  2 NA
3  3  7
4 NA  8
5  4  9
7  5 10
Sandipan Dey

My idea is basically the same as in the other replies.

For any row that consists entirely of NAs, the sum of !is.na(ROW) is zero, so you just have to drop the rows where that sum is zero.

So you can do:

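# Logical indexing avoids a pitfall of -which(...): if no row is all NA, which() returns integer(0) and every row is dropped.
df1 <- df[rowSums(!is.na(df)) != 0, ]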
Chris