
I have the following data set:

df <- data.frame(
  C      = c(1,2,3,1,2,3,1,2,3,1),
  weight = c(1,1.5,2,2,1.5,1,2,1,1.5,2.5),
  time   = c(15,20,30,45,60,15,20,30,45,60)
)

I need to aggregate the data by the variable C in order to find the median time for each C. Each observation is weighted by the variable 'weight'.

Is there a way to replace 'mean' with a weighted median in the following code?

output<-aggregate(.~C, data=df, mean, na.rm=TRUE)
Richie Cotton
user2568648

1 Answer


There is a weighted median function, weighted.median, in the bigvis package on GitHub.

library(devtools)
# The repository must be given as "user/repo"
install_github("hadley/bigvis")
library(bigvis)

aggregate applies its function to each column separately, so it doesn't work with functions that need multiple vector inputs (here, both time and weight). Use ddply from plyr instead.

library(plyr)
ddply(df, .(C), summarise, wm = weighted.median(time, weight))
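If bigvis cannot be installed (for example, when GitHub is unreachable), a weighted median can also be written in base R. The following is a minimal sketch, not the bigvis implementation: wmedian is a hypothetical helper that returns the smallest value whose cumulative weight reaches half of the total weight. Other definitions of the weighted median (for example, ones that interpolate) may give different results.

```r
# Hypothetical helper (not from any package): the "lower" weighted median,
# i.e. the smallest x whose cumulative weight reaches half the total weight.
wmedian <- function(x, w) {
  keep <- !is.na(x) & !is.na(w)   # drop incomplete pairs
  x <- x[keep]
  w <- w[keep]
  ord <- order(x)                 # sort values, carrying weights along
  x <- x[ord]
  w <- w[ord]
  x[which(cumsum(w) >= sum(w) / 2)[1]]
}

df <- data.frame(
  C      = c(1,2,3,1,2,3,1,2,3,1),
  weight = c(1,1.5,2,2,1.5,1,2,1,1.5,2.5),
  time   = c(15,20,30,45,60,15,20,30,45,60)
)

# Split the data by C and apply the helper to each piece
wm <- sapply(split(df, df$C), function(d) wmedian(d$time, d$weight))
wm
```

With this definition, the sample data gives 45 for C = 1 and 30 for both C = 2 and C = 3. The same helper can be used in the ddply call in place of weighted.median.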
Richie Cotton
  • When trying to install bigvis I get the following error: Error in function (type, msg, asError = TRUE) : Could not resolve host: github.com; Host not found – user2568648 Jan 23 '14 at 14:49
  • @user2568648 Are you on a corporate network? If so, the most likely explanation is that access to github is blocked by your network admins. Try and reach the site in a browser. – Richie Cotton Jan 23 '14 at 15:20