Call:
lm(formula = formal_engaged_replaced ~ setting_interest + setting_trust + 
    setting_contact + setting_confidence + setting_visibility + 
    network_close_network + network_help_neighbour + network_help_orgs + 
    personal_sex + poverty_replaced + personal_education, data = train_model)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.6084 -0.3168 -0.1750  0.4730  0.9894 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             0.694967   0.172963   4.018 8.15e-05 ***
setting_interest        0.082989   0.036994   2.243   0.0259 *  
setting_trust          -0.095590   0.039447  -2.423   0.0162 *  
setting_contact         0.006969   0.040477   0.172   0.8635    
setting_confidence      0.023480   0.041886   0.561   0.5757    
setting_visibility      0.090994   0.040536   2.245   0.0258 *  
network_close_network   0.001323   0.004591   0.288   0.7734    
network_help_neighbour -0.008214   0.030320  -0.271   0.7867    
network_help_orgs       0.032154   0.032971   0.975   0.3306    
personal_sex            0.016613   0.059559   0.279   0.7806    
poverty_replaced        0.017542   0.034989   0.501   0.6167    
personal_education      0.035840   0.020845   1.719   0.0870 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.4294 on 212 degrees of freedom
Multiple R-squared:   0.11, Adjusted R-squared:  0.06385 
F-statistic: 2.383 on 11 and 212 DF,  p-value: 0.008439

I'm confused about how to interpret this output. The Multiple R-squared is 0.11; does that mean the model is bad?

Also, when running a multiple regression, is it recommended to check the assumptions? My data are categorical and don't meet any of the assumptions, and I just coded them for regression without any transformation. Is there any way to do this?

Dave
  • 62,186
none
  • 31
  • Do you mean check to see if your dependent variable meets the assumption that it’s normally distributed? – scott.pilgrim.vs.r Jan 21 '23 at 16:02
  • I mean, if I check the data, it doesn't satisfy any of the assumptions. Is it necessary to check the assumptions, like independence, normal distribution, etc., in order to proceed with multivariate regression? – none Jan 22 '23 at 17:18
  • @none That question about checking for assumptions really warrants its own posted question, as it differs from the $R^2$ discussion. – Dave Jan 22 '23 at 17:22

1 Answer


The context matters.

In general, it is difficult to assign labels like “good” and “bad” to any performance metric, be it $R^2$ or something else. Your value of $0.11$ is better than $0.10$ and worse than $0.12$. However, it is not reasonable to think of $R^2$ in terms of letter grades in school. It could be that your value is the best ever at a task like this, which sounds like an $A$-grade to me; or it could be that even $R^2=0.9$ is rather mediocre performance, even though such a number looks like an $A$-grade.

What your value of $0.11$ does tell you is that you have made an improvement upon naïvely predicting the overall mean, $\bar y$, every time. While modelers might want to get much better predictions than such a naïve strategy would give, you are doing something useful, rather than being outperformed by such a simple strategy.
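To make the comparison with the naïve strategy concrete, here is a minimal sketch in R. It uses simulated data, not the asker's `train_model`, and the names `x`, `y`, `sse_model`, and `sse_mean` are made up for illustration. It shows that the $R^2$ reported by `summary()` is one minus the ratio of the model's squared error to the squared error from always predicting $\bar y$.

```r
# Simulated data (hypothetical): a weak linear signal buried in noise
set.seed(1)
n <- 224
x <- rnorm(n)
y <- 0.3 * x + rnorm(n)

fit <- lm(y ~ x)

# Sum of squared errors of the fitted model
sse_model <- sum(residuals(fit)^2)

# Sum of squared errors when always predicting the mean of y
sse_mean <- sum((y - mean(y))^2)

# R^2 as the proportional reduction in squared error relative to the mean
r2_by_hand <- 1 - sse_model / sse_mean
r2_reported <- summary(fit)$r.squared

print(c(by_hand = r2_by_hand, reported = r2_reported))
```

The two printed values should agree. For an ordinary least squares fit with an intercept, `sse_model` cannot exceed `sse_mean` on the training data, so the in-sample value cannot fall below zero; a positive $R^2$ says how much of the naïve strategy's squared error the model removes.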

If other work like yours is getting $R^2$ values around where you are, that should be encouraging. If other work is getting bigger values, that is less encouraging.

$R^2$ has some limitations, chiefly that it can be driven high by overfitting to the data. If you don’t want to do any kind of out-of-sample performance assessment, you might be interested in the adjusted $R^2$, which, loosely speaking, makes an attempt to penalize for overfitting, and I’ve given a more technical description here.
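As a rough check of how the adjustment works, the usual formula can be applied to the numbers in the posted output. This assumes $p = 11$ predictors and $n = 224$ observations, where $n$ is inferred from the 212 residual degrees of freedom plus the 12 estimated coefficients rather than stated in the question.

$$
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1} = 1 - 0.89 \times \frac{223}{212} \approx 0.0638
$$

This is close to the reported $0.06385$; the small gap is what one would expect from the printed $R^2$ being rounded to $0.11$.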

Dave
  • 62,186
  • Thank you for the brief answer. The value of R2 should lie between 0 and 1. If I interpret my model, is it the best fit with 0.11%? And is that enough to say that it made an improvement? Also, for the multivariate regression, is it necessary to check all the assumptions? – none Jan 22 '23 at 17:21
  • Your $R^2$ corresponds to explaining $11\%$ of the variance, not $0.11\%$. // The inquiry about checking assumptions really warrants a separate posted question. – Dave Jan 22 '23 at 17:23