Suppose that I have a data.frame as follows:

   a  b c
1  5 NA 6
2 NA NA 7
3  6  5 8

I would like to count the non-NA values in each column. The answer should look like:

a b c 
2 1 3 

So far, I've tried:

 !is.na(dat)               # gives a TRUE/FALSE matrix
 length(!is.na(dat))       # 9 -> length of the whole matrix
 dim(!is.na(dat))          # 3 x 3 -> dimensions of the matrix
 na.omit(dat)              # removes every row that contains an NA

How can I get the required result?

Hong Ooi
Ay_M

4 Answers

Or, faster:

colSums(!is.na(dat))
a b c 
2 1 3 
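
To reproduce this, here is a minimal sketch that rebuilds the data frame from the question. The name dat comes from this answer; building it with data.frame() is an assumption about how the original was created.

```r
# Rebuild the example data frame (assumed construction, values from the question).
dat <- data.frame(a = c(5, NA, 6),
                  b = c(NA, NA, 5),
                  c = c(6, 7, 8))

# !is.na(dat) is a logical matrix; colSums() adds up the TRUEs in each column.
counts <- colSums(!is.na(dat))
print(counts)
```

This works because TRUE counts as 1 and FALSE as 0 when summed, so each column total is the number of non-NA entries.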
agstudy
> apply(dat, 2, function(x){sum(!is.na(x))})
a b c 
2 1 3 
user1609452

Though the sum is probably a faster solution, I think that length(x[!is.na(x)]) is more readable.
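
As a rough sketch of how that expression could be applied per column: the helper name count_non_na and the use of sapply() are illustrative choices of mine, not something prescribed above, and dat is assumed to be the data frame from the question.

```r
# Example data frame from the question (assumed construction).
dat <- data.frame(a = c(5, NA, 6), b = c(NA, NA, 5), c = c(6, 7, 8))

# Hypothetical helper: keep the non-NA elements, then take their length.
count_non_na <- function(x) length(x[!is.na(x)])

# sapply() visits each column of the data frame and keeps the column names.
res <- sapply(dat, count_non_na)
print(res)
```

The result should match what the sum-based versions give, just as an integer vector rather than a numeric one.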

Dan Chaltiel

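nrow and ncol return NULL for a single column such as tsa$Region, because a plain vector has no dim attribute. I tried NROW and NCOL instead and they worked: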

> nrow(tsa$Region)
NULL
> NROW(tsa$Region)
[1] 27457


> ncol(tsa$Region)
NULL
> NCOL(tsa$Region)
[1] 1
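
Note that NROW counts NA entries too, so on its own it does not answer the original question. A small sketch with a made-up vector (region is a hypothetical stand-in for tsa$Region, whose contents are not shown here):

```r
# Hypothetical stand-in for a single column: a plain vector with one NA.
region <- c("north", NA, "south", "east")

no_dim   <- is.null(nrow(region))  # TRUE: nrow() returns NULL for a vector
n_all    <- NROW(region)           # 4: every element, NAs included
n_cols   <- NCOL(region)           # 1: the vector is treated as one column
n_non_na <- sum(!is.na(region))    # 3: only the non-NA elements

print(c(all = n_all, non_na = n_non_na))
```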