Combining regression output for by() into a single table

Question

I'm new to R, coding, and Stack Overflow, so apologies in advance if this is a basic question. I'm trying to combine the regression output for the 3 levels of the variable "Gender" into a single summary table that retains all of the information from the coefficient columns as well as the values (residual standard error, R-squared, adjusted R-squared, F-statistic, p-value) listed at the bottom of each output. Is anyone aware of an approach that works?

Here is what my output currently looks like:

library(tidyverse)
Final_Frame.df <- read_csv("indirect.csv")

my.fun <- function(Final_Frame2.df){summary(lm(Product_Use~Mean_social_combined +
  Mean_traditional_time+
  Mean_Passive_Use_Updated+
  Mean_Active_Use_Updated, data=Final_Frame.df))}

by(Final_Frame.df, list(Final_Frame.df$Gender), my.fun)

Output

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15

--------------------------------------------------------------------------------------------- 
: 2

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15

--------------------------------------------------------------------------------------------- 
: 3

Call:
lm(formula = Product_Use ~ Mean_social_combined + Mean_traditional_time + 
    Mean_Passive_Use_Updated + Mean_Active_Use_Updated, data = Final_Frame.df)

Residuals:
    Min      1Q  Median      3Q     Max 
-26.592  -8.178  -3.936   6.228  62.258 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -0.5814     1.9664  -0.296 0.767612    
Mean_social_combined       2.4961     1.1797   2.116 0.034906 *  
Mean_traditional_time      1.0399     0.7416   1.402 0.161567    
Mean_Passive_Use_Updated   2.8230     0.8308   3.398 0.000739 ***
Mean_Active_Use_Updated    2.7562     1.7421   1.582 0.114329    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.07 on 451 degrees of freedom
  (18 observations deleted due to missingness)
Multiple R-squared:  0.1517,    Adjusted R-squared:  0.1442 
F-statistic: 20.17 on 4 and 451 DF,  p-value: 2.703e-15

Have a look at `ddply` (from the plyr package), something like: `ddply(Final_Frame.df, 'Gender', function(d) { create_the_data_frame_you_need() })`. In the function body you have access to `d`, which is the subset of your data for each unique value of Gender. — Sirius, Mar 14 '21 at 13:33
In order for us to help you, please edit your question to include a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). For example, to produce a minimal data set, you can use `head()`, `subset()`, or the indices. Then use `dput()` to give us something that can be put in R immediately. Also, please make sure you know what to do [when someone answers your question](https://stackoverflow.com/help/someone-answers). More info can be found at StackOverflow's [help center](https://stackoverflow.com/help). Thank you! — iamericfletcher, Mar 14 '21 at 16:35

G. Grothendieck · Accepted Answer · 2021-03-14T14:07:49.713

1) broom. This produces one data frame of the coefficients and another of the model-level statistics, using tidy and glance from the broom package:

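library(broom)
library(dplyr)

# One row per coefficient per group (estimate, std.error, statistic, p.value)
mtcars %>%
  group_by(cyl) %>%
  group_modify(~ tidy(lm(mpg ~ disp + hp, data = .x))) %>%
  ungroup()

# One row per group (r.squared, adj.r.squared, sigma, F statistic, p.value, ...)
mtcars %>%
  group_by(cyl) %>%
  group_modify(~ glance(lm(mpg ~ disp + hp, data = .x))) %>%
  ungroup()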
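If a single table is wanted, the two results can be joined on the grouping variable. The following is only a sketch under a few assumptions: it reuses mtcars rather than the question's data, and the object names `coefs`, `stats`, and `combined` as well as the `model_` prefix are made up for illustration. The prefix is there because `tidy()` and `glance()` both return columns called `statistic` and `p.value`.

```r
library(broom)
library(dplyr)

# Coefficient table: one row per term per cyl group
coefs <- mtcars %>%
  group_by(cyl) %>%
  group_modify(~ tidy(lm(mpg ~ disp + hp, data = .x))) %>%
  ungroup()

# Model-level statistics: one row per cyl group
stats <- mtcars %>%
  group_by(cyl) %>%
  group_modify(~ glance(lm(mpg ~ disp + hp, data = .x))) %>%
  ungroup()

# Prefix the model-level columns (hypothetical "model_" prefix) so they do not
# clash with the coefficient-level `statistic` and `p.value`, then join on cyl
combined <- coefs %>%
  left_join(rename_with(stats, ~ paste0("model_", .x), .cols = -cyl),
            by = "cyl")

combined
```

Each coefficient row then carries its group's R-squared, adjusted R-squared, residual standard error (`model_sigma`), F statistic, and p-value. For the question's data the same pattern would presumably use `group_by(Gender)` and the question's formula. Note that `my.fun` in the question ignores its argument (`Final_Frame2.df`) and fits on the full `Final_Frame.df`, which is why the three outputs shown there are identical; passing the group subset as `data = .x` avoids that.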

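2) combined model. It is also possible to fit a single model with a separate intercept and separate slopes for each group. This is not equivalent to fitting the groups separately, because the single model estimates one residual variance from all groups, so the standard errors and the summary statistics differ. It does produce the same coefficients.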

summary(lm(mpg ~ factor(cyl)/(disp + hp) + 0, mtcars))
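As a quick check of the claim about the coefficients, the estimates for one group in the combined model can be compared with a separate fit on that group alone. This is a sketch using mtcars; the names `combined_fit`, `fit4`, and `from_combined` are illustrative only.

```r
# Single nested model: separate intercept and slopes for each cyl level
combined_fit <- lm(mpg ~ factor(cyl)/(disp + hp) + 0, data = mtcars)

# Separate fit on the 4-cylinder cars only
fit4 <- lm(mpg ~ disp + hp, data = subset(mtcars, cyl == 4))

# Pull out the coefficients belonging to cyl == 4 (intercept, disp, hp)
cf <- coef(combined_fit)
from_combined <- cf[grepl("^factor\\(cyl\\)4", names(cf))]

# Compare the estimates, ignoring the differing names
all.equal(unname(from_combined), unname(coef(fit4)))

# The residual standard errors are not expected to agree: the combined model
# pools residuals from all groups, the separate fit uses one group only
summary(combined_fit)$sigma
summary(fit4)$sigma
```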

3) nlme. lmList from nlme fits a separate lm for each group and gives some of the same information. nlme comes with R, so it does not have to be installed, only loaded using library as below.

library(nlme)
summary(lmList(mpg ~ disp + hp | cyl, mtcars, pool = FALSE))
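To get closer to a table from the lmList fit, `coef()` on the fitted object returns one row per group, and the individual per-group fits can be summarised one at a time. A minimal sketch, again on mtcars; `fits` and `r2_by_group` are illustrative names, and it assumes every group was fitted successfully.

```r
library(nlme)

# One lm fit per cyl level
fits <- lmList(mpg ~ disp + hp | cyl, data = mtcars, pool = FALSE)

# Data frame with one row per cyl level and one column per coefficient
coef(fits)

# Each element of `fits` is an ordinary lm object, so per-group statistics
# such as R-squared can be collected with sapply()
r2_by_group <- sapply(fits, function(m) summary(m)$r.squared)
r2_by_group
```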
