I am trying to generate a confusion matrix from predicted and actual data. I get an error that the levels are not equal, even when both variables are converted to factors. When I checked the levels, I believe the issue is that the test data has many repeated values and therefore fewer levels than the predicted values, which are all unique. Is there a way to force the levels of the test data to match those of the predictions?

confusionMatrix(as.factor(sale.pred),as.factor(housing.test.df$SalePrice))

sale.pred holds the forecasted values and housing.test.df$SalePrice holds the actual values. As stated, sale.pred has no duplicate values, so its number of levels equals the number of rows n, whereas housing.test.df$SalePrice has duplicate values, so its number of levels is less than n.
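For illustration, here is a minimal sketch of two ways the levels could be made to match, using base R only. The numbers are made-up stand-ins for sale.pred and housing.test.df$SalePrice, and the cut points in the second option are hypothetical; neither is taken from the actual data.

```r
# Made-up stand-ins for sale.pred and housing.test.df$SalePrice.
sale.pred <- c(101250.4, 149800.9, 151020.3, 248700.6, 252300.1)
actual    <- c(100000, 150000, 150000, 250000, 250000)

# Option 1: build both factors on one shared level set
# (the union of every value seen in either vector).
shared.levels <- sort(union(sale.pred, actual))
pred.f   <- factor(sale.pred, levels = shared.levels)
actual.f <- factor(actual,    levels = shared.levels)
identical(levels(pred.f), levels(actual.f))

# Option 2: bin both vectors with the same breaks, so the
# classes are price ranges rather than exact prices.
breaks <- c(0, 125000, 200000, Inf)  # hypothetical cut points
pred.bin   <- cut(sale.pred, breaks = breaks)
actual.bin <- cut(actual,    breaks = breaks)
tab <- table(Predicted = pred.bin, Actual = actual.bin)
tab

# With caret loaded, either pair could then be passed on, e.g.
# confusionMatrix(data = pred.bin, reference = actual.bin)
```

Option 1 should make the level sets identical, but since continuous predictions rarely equal an actual price exactly, almost every count would land off the diagonal. Option 2 treats the problem as classification over price ranges, which may be closer to what a confusion matrix is meant to summarize.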

Base_R_Best_R
B Bow
  • Check the following previously asked questions https://stackoverflow.com/questions/30002013/error-in-confusion-matrix-the-data-and-reference-factors-must-have-the-same-nu https://stackoverflow.com/questions/24801452/error-in-confusionmatrix-the-data-and-reference-factors-must-have-the-same-numbe – Nareman Darwish Jan 04 '20 at 20:09
  • Read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example and look into the https://github.com/tidyverse/reprex package – Bruno Jan 04 '20 at 20:34
  • I tried manually setting the levels between 60,000 and 755000 which seems to be the range sale.pred = 2^31 elements – B Bow Jan 04 '20 at 20:43
