I have a dataset for which I have to develop various models and compute the adjusted R2 value of all models.

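    from sklearn.metrics import make_scorer, r2_score
    from sklearn.model_selection import KFold, cross_val_score

    def mean_cv_r2(clf, x, y):  # hypothetical name; the original function header was not shown
        # 5-fold CV, shuffled with a fixed seed so the folds are reproducible
        cv = KFold(n_splits=5, shuffle=True, random_state=45)
        r2 = make_scorer(r2_score)
        r2_val_score = cross_val_score(clf, x, y, cv=cv, scoring=r2)
        scores = [r2_val_score.mean()]
        return scores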

I used the above code to calculate the R2 value of every model, but I am more interested in the adjusted R2 value of each model. Is there a Python package that can do this?

I will appreciate your help.

Ahamed Moosa
  • Possible duplicate of [How to get Adjusted R Square for Linear Regression](https://stackoverflow.com/questions/51023806/how-to-get-adjusted-r-square-for-linear-regression) – jorijnsmit Oct 25 '19 at 18:45
  • Possible duplicate https://stackoverflow.com/questions/49381661/how-do-i-calculate-the-adjusted-r-squared-score-using-scikit-learn/49381947 – jorijnsmit Oct 25 '19 at 19:00

1 Answer

You can calculate the adjusted R2 from R2 with a simple formula:

Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)

where n is the number of samples and p is the number of independent variables.
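
As a minimal sketch, the formula can be wrapped in a small helper. The name `adjusted_r2` and the example numbers are illustrative assumptions, not part of any library:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R2 from plain R2, sample count n and predictor count p."""
    if n - p - 1 <= 0:
        # The denominator must be positive for the formula to be defined
        raise ValueError("need n > p + 1 for adjusted R2 to be defined")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Illustrative numbers only: R2 = 0.80 on 100 samples with 5 predictors
print(adjusted_r2(0.80, n=100, p=5))  # roughly 0.789
```

Because (n - 1) / (n - p - 1) is greater than 1 whenever p >= 1, the adjusted value is never higher than the plain R2 it is computed from.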

Adjusted R2 also requires the number of independent variables, which `r2_score` never sees because it only receives the true and predicted values. That is why it cannot be computed with this function alone.
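
One possible way to get this during cross-validation is to pass `cross_val_score` a callable with the `(estimator, X, y)` signature, which does have access to the data shape. The sketch below is an assumption-laden illustration, not an official scikit-learn feature: `adjusted_r2_scorer` is a hypothetical name, the data is synthetic, and `n` is taken as the size of the held-out fold, as suggested in the comments on this answer.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_score


def adjusted_r2_scorer(estimator, X, y):
    """Custom scorer (hypothetical name) with the (estimator, X, y) signature."""
    n, p = X.shape  # rows in the held-out fold, number of features
    r2 = r2_score(y, estimator.predict(X))
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


# Synthetic data, used here only for illustration
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Fixed random_state so both calls see identical folds
cv = KFold(n_splits=5, shuffle=True, random_state=45)
r2_scores = cross_val_score(LinearRegression(), X, y, cv=cv, scoring="r2")
adj_scores = cross_val_score(LinearRegression(), X, y, cv=cv,
                             scoring=adjusted_r2_scorer)

print("mean R2:         ", r2_scores.mean())
print("mean adjusted R2:", adj_scores.mean())
```

Note that `X.shape[1]` counts the columns passed to the model, so if the pipeline creates or drops features internally, `p` would need to be set by hand.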

min2bro
  • Thanks, so I assume n = sample size and p = number of independent variables? – Ahamed Moosa Jun 26 '18 at 09:07
  • When we want to calculate adjusted R2 for each fold during cross-validation, will `n` correspond to the size of the dataset or the size of the fold? (e.g., 80% of the number of rows if we are doing 5-fold CV) @min2bro – nvergos Apr 25 '19 at 14:41
  • @nvergos n should correspond to the size of the fold. – jeffhale Aug 05 '20 at 17:34
  • Should I use the `n` and `p` of the train set regardless of whether I am evaluating on the train or the test set? Or should I use the train set's `n` and `p` when evaluating on the train set, and the test set's `n` and `p` when evaluating on the test set? – vasili111 Feb 24 '21 at 23:05
  • @vasili111 We check model performance on test data, so it's better to compute the adjusted R2 and R2 on test data. – Girish Kumar Chandora Jun 27 '21 at 16:33