I have a data frame and I'm trying to calculate the median for each group separately. When I split the data frame into two groups and calculate the median for each one, I get an NA result.

The data is:

    x1          x2          x3          x4          x5          x6          x7          y1          y2          y3          y4          y5          y6          y7          y8
    9.488404158 9.470895414 9.282433728 9.366707445 9.955383045 9.640816474 9.606262272 9.329651027 9.434541611 9.473922432 9.311412966 9.3154885   9.434977488 9.470895414 9.764258059
    8.630629966 8.55831075  8.788391003 8.576231135 8.671587906 8.842979993 8.861958856 8.58330436  8.603596508 8.570129609 8.59798922  8.572686772 8.679751791 8.663950953 8.432875347
    9.354748885 9.367668838 9.259952558 9.421538213 9.554635162 9.603744578 9.452197983 9.284228877 9.404607878 9.317737979 9.343115301 9.310644266 9.27227486  9.360337823 9.44706281
    9.944863964 9.950427516 10.19101759 10.07350804 10.03269879 10.1307908  10.03487287 9.74609383  9.886379007 9.775472567 10.036596   9.544738458 9.699611598 9.911962567 9.625804277

Code:

    rowN <- nrow(AT1)
    MD1 <- vector(length = rowN)
    MD2 <- vector(length = rowN)

    MD1[1:rowN] <- NA
    MD2[1:rowN] <- NA

    x <- AT1[, c(2, 3, 4, 5, 6, 7, 8)]
    write.csv(x, "x.csv", row.names = TRUE)
    x <- as.matrix(x)
    for (i in 2:rowN) {
        MD1[i] <- median(x[i, ])
    }
    write.csv(MD1, "MD1.csv", row.names = TRUE)

    y <- AT1[, c(9, 10, 11, 12, 13, 14, 15, 16)]
    write.csv(y, "y.csv", row.names = TRUE)
    y <- as.matrix(y)
    for (j in 2:rowN) {
        MD2[j] <- median(y[j, ])
    }
    write.csv(MD2, "MD2.csv", row.names = TRUE)
Ben Bolker

1 Answer

It would have been better to show a reproducible example. Based on the loop code, it seems that the OP wants the median of each row. Assuming the median is calculated separately for columns 2:8 and 9:16, we convert the 'data.frame' to a 'matrix' (`as.matrix`) and use `rowMedians` from `library(matrixStats)`.

x1 <- as.matrix(AT1[2:8])
x2 <- as.matrix(AT1[9:16])

library(matrixStats)
rowMedians(x1, na.rm=TRUE)
#[1] -0.09411013 -0.08554095  0.11953107 -0.26869311  0.33224445

rowMedians(x2, na.rm=TRUE)
#[1]  0.10557881 -0.74135403 -0.05876725  0.69230776 -0.21402339

data

set.seed(24)
m1 <- matrix(rnorm(5*15), ncol=15)
AT1 <- data.frame(col1= LETTERS[1:5], m1)
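
For comparison, the same row medians can be computed in base R without an extra package. The sketch below rebuilds the simulated `AT1` from the data block so that it runs on its own, and assumes, as the answer does, that columns 2:8 and 9:16 are numeric. It also illustrates one thing visible in the question's loop: it starts at row 2, so the first element of the result keeps its initial `NA`. The names `MD1_loop` and `MD1_fixed` are only for illustration.

```r
# Rebuild the example data so this block runs on its own
set.seed(24)
m1 <- matrix(rnorm(5 * 15), ncol = 15)
AT1 <- data.frame(col1 = LETTERS[1:5], m1)

x <- as.matrix(AT1[2:8])   # first group (7 columns)
y <- as.matrix(AT1[9:16])  # second group (8 columns)

# Base R: apply median() over rows (MARGIN = 1)
MD1 <- unname(apply(x, 1, median))
MD2 <- unname(apply(y, 1, median))

# The question's loop, started at row 2 as posted: row 1 is never filled
MD1_loop <- rep(NA_real_, nrow(x))
for (i in 2:nrow(x)) {
  MD1_loop[i] <- median(x[i, ])
}
is.na(MD1_loop[1])  # TRUE

# The same loop run over every row matches the apply() result
MD1_fixed <- rep(NA_real_, nrow(x))
for (i in seq_len(nrow(x))) {
  MD1_fixed[i] <- median(x[i, ])
}
all.equal(MD1_fixed, MD1)  # TRUE
```

If the first row of the real data is meant to be skipped on purpose, the `NA` in position 1 is expected; otherwise starting the loop at 1 (or using `apply`/`rowMedians`) avoids it.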
akrun
  • The error is produced in the second group (y), and it is: There were 50 or more warnings (use warnings() to see the first 50) – shawin karim Aug 31 '15 at 11:31
  • @shawinkarim Without a reproducible example, I can't comment. If I use some standard example, my code should work. – akrun Aug 31 '15 at 11:32
  • I am typing your name, akrun, but it disappears at the beginning? – shawin karim Aug 31 '15 at 11:47
  • @shawinkarim that's because the author of a post is always notified, so starting the message with '@ their_name' is redundant when there's no other comment author :) Just the way SO works, but I agree it could be disturbing at first (Will delete this when I see a +1 on the comment, meaning it's been read :p) – Tensibai Aug 31 '15 at 12:46
  • @shawinkarim, if you want more detailed answers, please include a reproducible example (see [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)) that clearly shows what is going wrong. The answer of @akrun nicely shows that you do not need a `for` loop to calculate the median per row of a matrix, and should also work on your data. – Paul Hiemstra Aug 31 '15 at 14:07
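
The thread does not show enough to say where the warnings in the `y` group come from. One possible cause, offered only as a guess, is that the columns were read in as text rather than numbers. In that case `as.matrix()` returns a character matrix, and `median()` on an even number of character values (the `y` group has 8 columns) ends up calling `mean()` on text, which returns `NA` with a warning. The sketch below uses a made-up data frame `df` (hypothetical, not the OP's data) to show how to check for this and one way to convert.

```r
# Hypothetical data: numeric values stored as text
df <- data.frame(
  id = c("a", "b"),
  y1 = c("9.33", "8.58"),
  y2 = c("9.43", "8.60"),
  y3 = c("9.47", "8.57"),
  y4 = c("9.31", "8.60"),
  stringsAsFactors = FALSE
)

# Check the column types before computing anything
sapply(df, class)

y_chr <- as.matrix(df[2:5])
is.character(y_chr)  # TRUE: the matrix holds text, not numbers

# median() of an even number of character values gives NA plus a warning
bad <- suppressWarnings(median(y_chr[1, ]))
is.na(bad)  # TRUE

# Convert each column to numeric, then take the row medians
y_num <- as.matrix(data.frame(lapply(df[2:5], as.numeric)))
MD2 <- unname(apply(y_num, 1, median))
MD2
```

Running `str(AT1)` or `sapply(AT1, class)` on the real data would confirm or rule out this explanation.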