I hope I am in the right place; I was pointed here from Stack Overflow for further feedback.
I would like to ask whether what I am trying to do to optimize my model makes sense to you.
Here is my problem: I am building a ranking based on the Elo approach. I have extended it with 2 additional parameters (the basic form has only one, K), and I would like to find out which values of those parameters give the best predictions. To do this, I decided to make some visualisations. I calculate (predicted score - real score)^2 for each game in which all players had already played at least 20 games. The predicted score is the probability that the player will win (in the long run), and the real score is 1 for a win, 0.5 for a draw and 0 for a loss. I decided to test several values that I guess will work best. I processed my whole dataset 75 times (3 different values for parameter A, 5 for B and 5 for C: 3 * 5 * 5 = 75) and stored the following results in a .csv file (the file name contains the values of the parameters that were used):
ID, (predicted score - real score)^2, number of observations
ID is just the number of the cycle; in one cycle I process several thousand rows. I use the second and third columns to determine the average prediction error: I divide the 2nd column by the 3rd. This is the metric that I would like to use to compare different parameter values.
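To make the metric concrete, here is a minimal Python sketch of how I understand one row of the file. It assumes that the second column holds the squared errors summed over all games in the cycle, so that dividing it by the third column gives the mean squared error. The function names and the three example games are hypothetical and only serve as illustration.

```python
def squared_error(predicted, actual):
    """Squared difference between the predicted win probability and the result."""
    return (predicted - actual) ** 2


def summarize_cycle(cycle_id, games):
    """Collapse one cycle of (predicted, actual) pairs into a CSV-style row:
    (ID, summed squared error, number of observations).
    Assumption: the second column is a sum over the cycle."""
    total = sum(squared_error(p, a) for p, a in games)
    return (cycle_id, total, len(games))


def average_error(row):
    """Second column divided by third column: mean squared error of the cycle."""
    _, total, n_obs = row
    return total / n_obs


# Hypothetical cycle with three games: a win, a draw and a loss.
games = [(0.75, 1.0), (0.5, 0.5), (0.25, 0.0)]
row = summarize_cycle(1, games)
print(row, average_error(row))
```

If I am not mistaken, this average of squared differences between a predicted probability and the outcome is what is usually called the Brier score.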
To sum up my questions:
- Does my way of judging whether my model is reasonable make sense to you from a statistical point of view, namely calculating (predicted score - real score)^2 / (number of observations)?
- What would you suggest for comparing the results in order to determine the best parameter values? Should I draw several plots in one chart and color them by parameter values, or is there a better way?
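
For the second question, the simplest comparison I can think of is a ranking of the parameter combinations by their overall error. The sketch below is only an assumption about how that could look: it pools all cycles of one run (sum of the second column divided by the sum of the third column) and sorts the runs from lowest to highest error. The parameter values and numbers in `results` are made up, and the function names are hypothetical.

```python
def pooled_mse(rows):
    """Overall mean squared error of one run: total squared error over all
    cycles divided by the total number of observations."""
    total = sum(sq_err for _, sq_err, _ in rows)
    count = sum(n_obs for _, _, n_obs in rows)
    return total / count


def rank_parameter_sets(results):
    """Return (pooled MSE, parameters) pairs sorted from best to worst."""
    scored = [(pooled_mse(rows), params) for params, rows in results.items()]
    return sorted(scored)


# Made-up example: keys are (A, B, C), values are rows of
# (ID, summed squared error, number of observations).
results = {
    (16, 0.5, 1.0): [(1, 250.0, 1000), (2, 230.0, 1000)],
    (32, 0.5, 1.0): [(1, 240.0, 1000), (2, 220.0, 1000)],
}

for mse, params in rank_parameter_sets(results):
    print(params, mse)
```

Pooling weights every game equally; averaging the per-cycle averages instead would weight every cycle equally, which gives a different number when the cycles contain different numbers of games.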