I have a data frame with blanks at different positions. I would like to calculate the mean of each row, but I do not know how to make n vary with the number of non-blank values in that row.

All the columns have the same number of rows.

mean = sum/n

df1 <- data.frame(col1 = c(5, 9, NA, -0.9, -0.74, , 1.19, , -1, -0.4, 1.38, -1.5, 1, 0.64), 
                  col2 = c(4, 2, -9, 4, 19, 31, 4, -8, -15,NA,NA,NA,NA,NA),
                  col3 = c(1, -2, 5, 1.1, 33, 2, 7, 1, 1, 16, -22, - 2, -3,-10))

So that, for row1:

(5 + 4 + 1) / 3 = 10 / 3 ≈ 3.33

but for row 3: ((-9) + 5) / 2 = -4 / 2 = -2
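
To make the n in mean = sum/n explicit, here is a minimal sketch. It assumes the blanks have already been turned into NA, and it uses only the first three rows of the example; `df_small`, `row_sum`, `row_n` and `row_mean` are names made up for illustration.

```r
# First three rows of the example data; the missing cell is NA
df_small <- data.frame(col1 = c(5, 9, NA),
                       col2 = c(4, 2, -9),
                       col3 = c(1, -2, 5))

row_sum  <- rowSums(df_small, na.rm = TRUE)  # sum of the non-missing values per row
row_n    <- rowSums(!is.na(df_small))        # n = count of non-missing values per row
row_mean <- row_sum / row_n                  # per-row mean with a row-specific n

row_mean
```

This should give the same values as `rowMeans(df_small, na.rm = TRUE)`, which does the counting internally.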

Thank you very much for your help!

student24
  • You can just omit missing values: rowMeans(df1, na.rm=TRUE) – kevin Oct 01 '20 at 05:29
  • When I tried that, I got an error: "Error in rowMeans(df1, na.rm = TRUE) : 'x' must be numeric." My values are numeric apart from the blanks, if that is what it means. – student24 Oct 01 '20 at 05:36
  • Since you have those "blanks" (-0.74, ,1.19), you should convert them to NAs; take a look at this post: https://stackoverflow.com/questions/24172111/change-the-blank-cells-to-na. If you read the data frame from a file (e.g., a csv), you could do it at import time with the na.strings parameter. – kevin Oct 01 '20 at 06:16
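
As a rough illustration of the na.strings suggestion in the last comment, here is a sketch that reads a small CSV supplied as text. The CSV content and the names `csv_text` and `df_in` are hypothetical; whether a real file needs this depends on how its blanks are actually stored.

```r
# Hypothetical CSV content; empty fields stand in for the "blanks"
csv_text <- "col1,col2,col3
5,4,1
,2,-2
-0.9,,5"

# na.strings lists the strings that read.csv should treat as missing
df_in <- read.csv(text = csv_text, na.strings = c("", "NA"))

sapply(df_in, is.numeric)      # check that every column came in as numeric
rowMeans(df_in, na.rm = TRUE)  # row means over the non-missing values
```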

2 Answers

If your data contain blanks, the affected columns are stored as character even though they look numeric. Turn the blanks into NA, convert the columns to numeric, and then take the row means.

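# Replace empty strings with NA
df1[df1 == ''] <- NA
# Convert every column to numeric while keeping the data frame structure
df1[] <- lapply(df1, as.numeric)
# Row means over the non-missing values
df1$rowMean <- rowMeans(df1, na.rm = TRUE)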
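
For a self-contained illustration of these steps, here is a small sketch. It assumes the blanks arrive as empty strings; `df_chr` and its values are made up for the example.

```r
# Hypothetical data: blanks stored as empty strings make the columns character
df_chr <- data.frame(col1 = c("5", "9", "", "-0.9"),
                     col2 = c("4", "2", "-9", "4"),
                     col3 = c("1", "-2", "5", "1.1"),
                     stringsAsFactors = FALSE)

sapply(df_chr, class)                             # every column is "character" here

df_chr[df_chr == ""] <- NA                        # blanks become NA
df_chr[] <- lapply(df_chr, as.numeric)            # convert each column, keep the data frame
df_chr$rowMean <- rowMeans(df_chr, na.rm = TRUE)  # mean of the non-missing values per row

df_chr
```

Assigning with `df_chr[] <-` rather than `df_chr <-` keeps the result a data frame instead of a plain list.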
Ronak Shah

Try this:

df1 <- data.frame(col1 = as.numeric(c(5, 9, NA, -0.9, -0.74, NA, 1.19, 2, -1, -0.4, 1.38, -1.5, 1, 0.64)),
                  col2 = as.numeric(c(4, 2, -9, 4, 19, 31, 4, -8, -15, NA, NA, NA, NA, NA)),
                  col3 = as.numeric(c(1, -2, 5, 1.1, 33, 2, 7, 1, 1, 16, -22, -2, -3, -10)))

rowMeans(df1, na.rm = TRUE)

It is possible that, although your data look numeric, R read them in as character.
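
One way to check for this, sketched on a made-up vector `x` (the values are hypothetical, not taken from the question), is to look at the class and at the entries that do not convert to numbers:

```r
# Hypothetical column as it might arrive from a file: numbers stored as text
x <- c("5", "9", "", "-0.9", " ")

class(x)  # "character"; a data frame holding such a column makes rowMeans() fail

# Entries that are not NA to begin with but turn into NA when converted
bad <- is.na(suppressWarnings(as.numeric(x))) & !is.na(x)
x[bad]    # the values that prevent the column from being numeric
```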

Elias