In my dataset, the true target value for an unseen observation is 2500000000. Fitting polynomial regressions of degree 1 through 7 gives the following results:

Degree: 1, Mean Squared Error: 410983500.2950994, Predicted: -81178512721.65013

Degree: 2, Mean Squared Error: 5829093.43611284, Predicted: -81178512721.65013

Degree: 3, Mean Squared Error: 0.006954321149912846, Predicted: -81178512721.65013

Degree: 4, Mean Squared Error: 0.9152467870231109, Predicted: -81178512721.65013

Degree: 5, Mean Squared Error: 19.06090147224352, Predicted: -81178512721.65013

Degree: 6, Mean Squared Error: 8651.727673046189, Predicted: -81178512721.65013

Degree: 7, Mean Squared Error: 78583221.21056704, Predicted: -81178512721.65013

The polynomial regression has the lowest error at degree = 3, but the predicted value is far from the target. Moreover, the predicted value is identical, and negative, for every degree. Can anybody shed light on this?

PS Nayak
  • High-degree polynomial regression without regularization is awful :) This has been addressed in other questions, such as this one: https://stats.stackexchange.com/questions/549012/why-is-the-use-of-high-order-polynomials-for-regression-discouraged – John Madden Oct 31 '22 at 14:38
  • Can the data be made available? This apparent numeric instability can probably be fixed with appropriate centering and scaling of the predictor variable. – JimB Oct 31 '22 at 14:49
  • @JohnMadden I followed the link you mentioned and found regression splines suggested as a solution. I went through an article at https://www.analyticsvidhya.com/blog/2018/03/introduction-regression-splines-python-codes/ and found it useful. Thank you. – PS Nayak Oct 31 '22 at 16:49
  • @JimB Sorry, the data cannot be made public because it is proprietary. I'll try your method. Thank you. – PS Nayak Oct 31 '22 at 16:51
  • The real fix is in the link in @JohnMadden's comment. But if you really need polynomials, then centering and scaling should reduce numerical instability. Using orthogonal polynomials will also reduce it. – JimB Oct 31 '22 at 17:07
  • @JimB I do not specifically need polynomials, just the best fit. For that, I'll try all the suggestions mentioned. – PS Nayak Oct 31 '22 at 17:24
  • "Best fit" means nothing until you describe the space of possible functions you will permit to fit the data. Otherwise there is a bewilderingly large and varied set of optimal fits. – whuber Nov 01 '22 at 11:33
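
The centering, scaling, and orthogonal-polynomial suggestions in the comments can be sketched roughly as follows. This is only an illustration on synthetic data, since the asker's data and code are not available: the cubic trend, the predictor range, the helper `to_unit`, and the query point `x_new` are all assumptions made up for the example. It compares the condition number of the degree-7 design matrix for raw powers, for powers of a predictor rescaled to [-1, 1], and for a Legendre basis on the rescaled predictor, then fits and predicts with the Legendre basis.

```python
import numpy as np
from numpy.polynomial import legendre

# Synthetic data (an assumption; the asker's data is proprietary):
# a cubic trend observed on a predictor with large raw values.
rng = np.random.default_rng(0)
x = np.linspace(1_000.0, 2_000.0, 50)
y = 2.0 + 0.5 * x + 1e-3 * x**2 + 1e-6 * x**3 + rng.normal(0.0, 1.0, x.size)

degree = 7

# Raw monomial design matrix: columns 1, x, ..., x^7 differ by many
# orders of magnitude, which makes least squares ill-conditioned.
raw = np.vander(x, degree + 1, increasing=True)

# Hypothetical helper: map the predictor linearly onto [-1, 1] using the
# training range. The same mapping must be applied to any new point.
x_min, x_max = x.min(), x.max()


def to_unit(v):
    return 2.0 * (np.asarray(v, dtype=float) - x_min) / (x_max - x_min) - 1.0


t = to_unit(x)

# Monomials of the rescaled predictor.
scaled = np.vander(t, degree + 1, increasing=True)

# Legendre polynomials of the rescaled predictor (orthogonal on [-1, 1]).
leg = legendre.legvander(t, degree)

cond_raw = np.linalg.cond(raw)
cond_scaled = np.linalg.cond(scaled)
cond_leg = np.linalg.cond(leg)
print("condition number, raw powers:    ", cond_raw)
print("condition number, scaled powers: ", cond_scaled)
print("condition number, Legendre basis:", cond_leg)

# Least-squares fit in the Legendre basis, then predict at a new point.
coef, *_ = np.linalg.lstsq(leg, y, rcond=None)
x_new = 1_500.0  # hypothetical unseen predictor value
y_new = legendre.legval(to_unit(x_new), coef)
print("prediction at x_new:", y_new)
```

Rescaling does not address the overfitting concern raised in the linked question; it only removes one numerical source of wildly wrong predictions. A high-degree fit can still extrapolate badly outside the training range.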
