I want to compute summary statistics (the mean, and possibly the standard deviation and other quantities) for one column of a data frame, grouped by another, categorical column.

I know that I can get a summary of the whole column with

summary(data$rating)

but I am not sure how to get the summary statistics for each gender separately.

I tried

summary(data$rating, data$gender)

but that gives me nothing other than the output of summary(data$rating).

marc_s
pkpkPPkafa

2 Answers

You could also use the by function:

by(data$rating, data$gender, summary)
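Since the question asks for the mean and standard deviation specifically, here is a minimal sketch of by() with a custom function in place of summary. The data frame `toy` and its values are made up for illustration and stand in for the asker's `data`.

```r
# Hypothetical toy data standing in for the asker's data frame
toy <- data.frame(
  rating = c(10, 20, 30, 40, 50, 60),
  gender = c("female", "female", "female", "male", "male", "male")
)

# by() splits rating by gender and applies the function to each piece;
# returning a named vector labels the statistics in the printed output
res <- by(toy$rating, toy$gender, function(x) c(mean = mean(x), sd = sd(x)))
res

# The result behaves like a list indexed by group, so one group can be pulled out
res[["female"]]
```

Any function that takes a numeric vector can be used the same way, so further statistics can be added to the named vector as needed.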
cimentadaj

Use tapply() or aggregate():

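# Simulated example data: 30 ratings, each with a randomly assigned gender
data <- data.frame(rating = 100 * runif(30),
                   gender = sample(c("female", "male"), 30, replace = TRUE))

# Full summary() of rating for each gender
tapply(data$rating, data$gender, summary)

# Selected statistics per gender; a named vector gives labelled columns
aggregate(data$rating, by = list(gender = data$gender),
          FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x)))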
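When FUN returns several values, aggregate() stores them as a single matrix column, which can be awkward to work with afterwards. The sketch below uses the formula interface of aggregate() and then flattens the result into ordinary columns; the data frame `toy` and the names `stats` and `flat` are made up for illustration.

```r
# Hypothetical toy data standing in for the asker's data frame
toy <- data.frame(
  rating = c(10, 20, 30, 40, 50, 60),
  gender = rep(c("female", "male"), each = 3)
)

# Formula interface: rating summarised within each level of gender
stats <- aggregate(rating ~ gender, data = toy,
                   FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x)))

# stats$rating is a matrix column; do.call(data.frame, ...) spreads it into
# separate columns named rating.mean, rating.median and rating.sd
flat <- do.call(data.frame, stats)
flat
```

After flattening, each statistic can be accessed like any other column, for example flat$rating.mean.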
Bernhard