I am trying to calculate the $R^2$ value for a production-constrained spatial interaction model, using Fotheringham and O'Kelly (1989) as my guide.
I get dramatically different values for $R^2$ depending on whether I calculate it as 1 - SSe/SSt or as cor(x, y)^2. Is this result expected? Of course, I may well be miscalculating something along the way.
I want to use $R^2$ as a (flawed but nevertheless useful and widely understood) measure of goodness of fit, as recommended by Fotheringham & Knudsen (1987).
A reproducible example is below. I've saved my model output to a CSV file to save space here.
predobs <- read.csv("http://dl.dropbox.com/u/66606821/pred_obs.csv")
sst <- sum((predobs$obs - mean(predobs$obs))^2)
sse <- sum((predobs$obs - predobs$pred)^2)
(r.square.1 <- 1 - (sse/sst))
(r.square.2 <- cor(predobs$obs, predobs$pred)^2)
If I fit lfit <- lm(predobs$obs ~ predobs$pred), the model I get has a slope of 0.56 and an intercept of -0.0387. If you incorporate this affine shift into your prediction, then the two forms should be equal. You can check this via r.square.3 <- 1 - (sum((lfit$residuals)^2) / sst) – shabbychef May 01 '12 at 04:38
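The point in the comment can be checked without the CSV. Below is a minimal sketch on synthetic data; the vectors obs and pred are made up for illustration and are not the values in pred_obs.csv. The predictions are deliberately off by a scale and an offset, so 1 - SSE/SST is penalised for the miscalibration while the squared correlation is not. Regressing obs on pred and using those residuals should bring the two measures back into agreement.

```r
# Synthetic data (hypothetical; NOT the pred_obs.csv values from the question)
set.seed(1)
obs  <- rnorm(200, mean = 10, sd = 3)
# Predictions that track obs but are mis-scaled and offset, plus noise
pred <- 2 + 1.8 * obs + rnorm(200, sd = 1)

sst <- sum((obs - mean(obs))^2)   # total sum of squares
sse <- sum((obs - pred)^2)        # squared error of the raw predictions

r.square.1 <- 1 - sse / sst       # sensitive to the scale/offset error
r.square.2 <- cor(obs, pred)^2    # unchanged by affine shifts of pred

# Regress obs on pred; the fitted values are the recalibrated predictions
lfit <- lm(obs ~ pred)
r.square.3 <- 1 - sum(residuals(lfit)^2) / sst

print(c(r.square.1 = r.square.1,
        r.square.2 = r.square.2,
        r.square.3 = r.square.3))
print(all.equal(r.square.2, r.square.3))
```

Because least squares with an intercept minimises the squared error over all affine transformations of pred, r.square.1 can never exceed r.square.3. Under the assumptions of this sketch, the gap between them measures how badly the raw predictions are calibrated rather than how well they correlate with the observations.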