Hey guys, I'm doing a stats project and need help. I have a data set and I'm trying to find which county has the highest mean weight and which county has the lowest mean weight. What command can I use to find the mean weight of every county? All the information I have is the weights of everyone in the data set and their counties.
-
See `help('mean')`, `help('min')` and `help('which.min')`. And the same for `max`. – Rui Barradas May 02 '20 at 15:59
-
It would be important to know how your data are structured in R. The post "https://stackoverflow.com/questions/21982987/mean-per-group-in-a-data-frame" will probably answer your question. If not, please give us a glimpse of what your data look like. E.g. you could use `head(df)` to print the first rows of a data frame so that we get a clearer idea of what you need. – Jan May 02 '20 at 16:05
-
Hey, I tried to use `head(df)` and got `function (x, df1, df2, ncp, log = FALSE) { if (missing(ncp)) .Call(C_df, x, df1, df2, log) else .Call(C_dnf, x, df1, df2, ncp, log)`. I'm not too sure what this means. That other post used an aggregate function, which I'm not familiar with and which I know we haven't been taught, so I'm not comfortable using it. Thanks – John May 02 '20 at 16:12
-
What name does the object have in which your data are stored? That is the argument needed by the head-function. So go with `head( [MyDataSet] )`. – Jan May 02 '20 at 16:25
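For context, the output quoted in the comment above is the source of R's built-in `df()` function (the F-distribution density from the stats package), which is what the name `df` refers to when no data frame of that name exists. The following is a minimal sketch of the approach hinted at in the first comment, using only `tapply`, `which.max` and `which.min` rather than `aggregate`. The object name `weights_data` and the column names `County` and `Weight` are hypothetical stand-ins, since the real data set was not shown.

```r
# Hypothetical stand-in for the real data set; the object name and the
# column names are assumptions made for illustration only.
weights_data <- data.frame(
  County = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Weight = c(76, 68, 59, 60, 80, 50, 70, 89, 56)
)

# Pass the name of your own data object to head(), not the built-in df.
head(weights_data)

# Mean weight per county, returned as a vector named by county.
county_means <- tapply(weights_data$Weight, weights_data$County, mean)

# Names of the counties with the highest and the lowest mean weight.
highest_county <- names(which.max(county_means))
lowest_county  <- names(which.min(county_means))
```

Printing `county_means` shows every county's mean at once, so the two extremes can also be checked by eye.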
1 Answer
Could you show a bit of the data? I assume yours looks something like this:
| County | Person | Weight |
|--------|--------|--------|
| A | Joe | 76 |
| A | Mary | 68 |
| A | Lucy | 59 |
| B | Carlos | 60 |
| B | Lucas | 80 |
| B | Lola | 50 |
| C | Pierre | 70 |
| C | Xavier | 89 |
| C | Simone | 56 |
If that is the case, I would use the aggregate function like this, where `df` is the name of your data frame, `df[, 3]` is the weight column, `df$County` is the variable to group by and `mean` is the function applied to each group:
df_ag <- aggregate(df[, 3], list(County = df$County), mean)
That gives you a new data frame `df_ag` with the columns `County` and `x` (the mean weight per county), which you can sort with
df_ag <- df_ag[order(df_ag$x), ]
which sorts the new data frame in ascending order of the mean (for descending order, put "-" in front of the variable you order by, i.e. `order(-df_ag$x)`). Finally, you get the county with the lowest mean with `head(df_ag, 1)` and the county with the highest mean with `tail(df_ag, 1)` (or the other way around if you sorted in descending order).
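Put together, the steps above can be sketched end to end as follows. This assumes the data match the sample table in this answer, which may not be how the real data set looks. It uses the formula interface of `aggregate`, a variant that keeps the original column names `County` and `Weight` in the result.

```r
# Sample data copied from the table above; the real data set may differ.
df <- data.frame(
  County = rep(c("A", "B", "C"), each = 3),
  Person = c("Joe", "Mary", "Lucy", "Carlos", "Lucas", "Lola",
             "Pierre", "Xavier", "Simone"),
  Weight = c(76, 68, 59, 60, 80, 50, 70, 89, 56)
)

# Formula interface: mean of Weight within each County.
df_ag <- aggregate(Weight ~ County, data = df, FUN = mean)

# Sort by mean weight, lowest first.
df_ag <- df_ag[order(df_ag$Weight), ]

head(df_ag, 1)  # row for the county with the lowest mean weight
tail(df_ag, 1)  # row for the county with the highest mean weight
```

With only a handful of counties, printing the whole sorted `df_ag` is just as easy and shows the full ranking.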
You can also check https://www.statmethods.net/management/sorting.html, http://rfunction.com/archives/699 and the question "Mean per group in a data.frame" (https://stackoverflow.com/questions/21982987/mean-per-group-in-a-data-frame).
Enrique