Optimizing parameters via simple visualisation technique

Question

I hope I am in the right place; I was pointed here from Stack Overflow for further feedback.
I would like to ask whether what I am doing to optimize my model makes sense to you.

Here is my problem: I am building a ranking based on the Elo approach. I have extended it with 2 additional parameters (the basic form has only one, K), and I would like to check which values of those parameters give the best predictions. To do this, I decided to make some visualisations. I calculate (predicted score - real score)^2 for each game in which all players had already played 20 games. The predicted score is the probability that the player will win (in the long run), and the real score is 1 for a win, 0.5 for a draw and 0 for a loss. I decided to test several values that I guess will work best. I processed my whole dataset 75 times (3 different values for parameter A, 5 for B and 5 for C: 3 * 5 * 5 = 75) and stored the following results in .csv files (each file name records the parameter values that were used):

 ID, (predicted score - real score)^2, number of observations

ID is just the number of the cycle; in one cycle I process several thousand rows. I use the second and third columns to determine the average prediction error: I divide the second column by the third. This is the metric that I would like to use to compare different parameter values.
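As a minimal sketch of that metric, assume each row holds (cycle ID, summed squared error for the cycle, number of observations), as the column list above suggests. The function name and the sample numbers are hypothetical, not taken from the real data. Pooling the sums before dividing weights each cycle by its number of observations, so a small cycle does not count as much as a large one:

```python
def mean_squared_error(rows):
    """Pooled mean squared error over (cycle_id, sum_sq_error, n_obs) rows."""
    total_error = sum(sq for _, sq, _ in rows)
    total_obs = sum(n for _, _, n in rows)
    if total_obs == 0:
        raise ValueError("no observations")
    return total_error / total_obs


# Hypothetical rows for one parameter combination (illustrative numbers only).
rows = [(1, 120.0, 500), (2, 230.0, 1000), (3, 100.0, 500)]

# Error per cycle: second column divided by the third, as in the question.
per_cycle = [sq / n for _, sq, n in rows]

# One number per parameter combination, usable for comparing the 75 runs.
overall = mean_squared_error(rows)
```

Computing `overall` once per .csv file would give one comparable value per (A, B, C) combination.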

To sum up my questions:

1. Does my way of determining whether my model is reasonable make sense to you from a statistical point of view (namely calculating (predicted score - real score)^2 / (number of observations))?
2. What do you suggest for comparing the results in order to determine the best parameter values? Should I draw several plots in one chart and color them by parameter value, or is there a better way?

Ad 1. Your formula vaguely represents the variance of regression, and related indicators (such as R²) are commonly used as quality criteria. Ad 2. Why would you apply a visual method for parameter optimization? Why don’t you apply some sort of regression analysis to determine those parameter values that minimize the standard error of regression? — mzuba, Sep 07 '11 at 08:31
@mzuba I am aware that approaches like stochastic gradient descent exist, but I do not really have time to learn them, and I probably won't ever come back to this topic once I finish this academic task. I am aware that a visual method is not really appropriate, but it is still better than a pure guess. I would be willing to spend even a few days learning the proper way, but I doubt I would be sure of what I am doing. That is why I prefer to use a simpler approach that I understand. Could you please explain why you used the word "vaguely" and how I can improve my approach? — mkk, Sep 07 '11 at 09:31

mzuba · Accepted Answer · 2011-09-07T10:42:52.097

Ad 1. Your formula vaguely represents the variance of residuals, and related indicators (such as R²) are commonly used as quality criteria.

R², which is commonly used to describe the goodness of fit of a model, measures the share of the variance of the score that is explained by the Elo ranking, as opposed to the share that is left unexplained. The formula is:

$R^2 = 1 - \frac{ESS}{TSS}$

where ESS is the error (residual) sum of squares and TSS is the total sum of squares:

$ESS = \sum (\text{predicted score} - \text{true score})^2$

$TSS = \sum (\text{true score} - \text{average true score})^2$

This R² should be corrected to account for the number of parameters and observations:

$\bar{R}^2 = 1-(1-R^2)\frac{n-1}{n-p-1}$ where $n$ is the number of observations and $p$ is the number of parameters.

One option is to use the model with the highest R², but others may know more about model selection.
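A minimal sketch of both quantities, using the standard definition of R² as one minus the ratio of the residual sum of squares to the total sum of squares, and the adjustment formula given above. The function names and the sample data are hypothetical; `p=3` assumes the three parameters A, B and C mentioned in the question:

```python
def r_squared(predicted, actual):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    rss = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    tss = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - rss / tss


def adjusted_r_squared(r2, n, p):
    """Adjust R^2 for n observations and p parameters."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical predictions and outcomes (1 = win, 0.5 = draw, 0 = loss).
predicted = [0.8, 0.6, 0.3, 0.5, 0.9, 0.2]
actual = [1.0, 0.5, 0.0, 0.5, 1.0, 0.0]

r2 = r_squared(predicted, actual)
adj = adjusted_r_squared(r2, n=len(actual), p=3)
```

With this definition R² cannot exceed 1, and it turns negative when the model predicts worse than always guessing the average score.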

Ad 2. Why would you apply a visual method for parameter optimization? Why don’t you apply some sort of regression analysis to determine those parameter values that minimize the standard error of regression?

It seems to me you have already put quite a lot of effort into testing your ranking (75 runs and a number of csv files and plots). If you specify how your new ranking scheme is calculated, and how the data you base it on is structured, I bet someone will walk you through the necessary steps to perform a statistical analysis.
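Since the extended scheme is not specified, the following is only a sketch of picking a parameter value numerically rather than by eye. It assumes a plain Elo update with the single parameter K, a starting rating of 1500 and the usual 400-point logistic scale; the game list is made up, and the 20-game warm-up from the question is omitted for brevity:

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def mean_sq_error(games, k, start=1500.0):
    """Replay games in order and average (predicted - real)^2 for player A."""
    ratings = {}
    total = 0.0
    for a, b, score_a in games:
        r_a = ratings.get(a, start)
        r_b = ratings.get(b, start)
        e_a = expected_score(r_a, r_b)
        total += (e_a - score_a) ** 2      # predict before updating
        delta = k * (score_a - e_a)
        ratings[a] = r_a + delta           # winner gains what the loser drops
        ratings[b] = r_b - delta
    return total / len(games)


# Hypothetical games: (player A, player B, score of A).
games = [("x", "y", 1.0), ("y", "z", 0.5), ("x", "z", 1.0),
         ("y", "x", 0.0), ("z", "y", 0.0)]

candidates = [8, 16, 24, 32, 40]
errors = {k: mean_sq_error(games, k) for k in candidates}
best_k = min(errors, key=errors.get)       # candidate with the lowest error
```

The same loop extends to a 3 * 5 * 5 grid by iterating over `itertools.product` of the candidate values for A, B and C, provided the error function accepts all three.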

If you want to stick with visualisation, I would plot the values of your “residuals”, i.e. the predicted scores minus the actual ones, or distributions thereof.
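A rough sketch of looking at the distribution of residuals without any plotting library: the residuals are counted per bin and printed as a text histogram. The function name, the bin width and the sample data are assumptions for illustration only:

```python
import math
from collections import Counter


def residual_bins(predicted, actual, bin_width=0.25):
    """Count residuals (predicted - actual) per bin, keyed by bin lower edge."""
    counts = Counter()
    for p, a in zip(predicted, actual):
        lower_edge = math.floor((p - a) / bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


# Hypothetical predictions and outcomes (1 = win, 0.5 = draw, 0 = loss).
predicted = [0.8, 0.6, 0.3, 0.5, 0.9, 0.2]
actual = [1.0, 0.5, 0.0, 0.5, 1.0, 0.0]

width = 0.25
for edge, count in residual_bins(predicted, actual, width).items():
    # One '#' per residual falling in the half-open bin [edge, edge + width).
    print(f"[{edge:+.2f}, {edge + width:+.2f}) {'#' * count}")
```

A histogram centred on zero suggests unbiased predictions; a shift to one side suggests the model systematically over- or under-predicts.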

Finally, I wonder whether chess scores can be objectively assessed at all. If one player wins against another, does this always mean he should have had a higher rating? I would be interested to know what your rating is based on.

@mzuba I believe I am counting both ESS and TSS correctly, however for most of the cycles the R^2 is above 1, which I believe means there is an error. How can I understand this? — mkk, Sep 07 '11 at 14:30
