I have a regression model in which the dependent variable is the difference in income between adjacent towns i and j. The independent variables are likewise differences in other parameters between the two towns. The idea is that any unobservables shared by the two towns are differenced out. I have multiple pairs of towns, where "adjacent" follows a specific definition of closeness. So if towns a, b and c are adjacent, I essentially have 3 pairs: (a,b), (b,c) and (a,c). For each pair I compute the difference in income and in the other variables, and run the regression on these differences.

Alternatively, I also wanted to introduce town fixed effects, so that anything that is constant within a town is controlled for. In Stata, I ran the following two commands:

    encode town, g(town_number)
    regress income x y z i.town_number

The R-squared values of the first and second models are very different. I understand that the fixed effects model has many more independent variables and therefore a higher R-squared. But even the adjusted R-squared is higher for the fixed effects model. What is the reason for this? Thanks in advance.

  • Can you provide more detail on your model and variables? I'm a bit worried that you might get answers that don't really address your question, because statisticians and economists use the term "fixed effects" differently. Rob Hyndman has a nice article about this and writes, "A “fixed effect” in statistics is a non-random regression term, while a “fixed effect” in econometrics means that the coefficients in a regression model are time-invariant." See here under confusing terminology:

    https://bit.ly/3qVGz1X

    – StatsStudent Sep 04 '23 at 11:06
  • Maybe this is a variant on @StatsStudent's question, but how can your DV be the difference in income between two towns? You would have only one value. – Peter Flom Sep 04 '23 at 11:09
  • That was my thinking too, @PeterFlom. I'm not an economist, but I've come across the term "fixed effects" when working with economists, and there it takes on an entirely different meaning. Since the question didn't really make sense to me, I figured this might be the culprit. I think a more detailed explanation of the problem would help. – StatsStudent Sep 04 '23 at 11:11
  • Do you, perhaps, have multiple pairs of towns? – Peter Flom Sep 04 '23 at 11:14
  • And is this a "panel analysis?" – StatsStudent Sep 04 '23 at 11:15
  • The two models use completely different responses. The difference model usually has a lower R-squared. – Michael M Sep 04 '23 at 11:18
  • Apologies. Yes, I have multiple pairs of towns. These towns are close by (defined a little differently in terms of administrative boundaries etc.). So if towns a, b and c are adjacent, I essentially have 3 pairs: (a,b), (b,c) and (a,c). I then compute the difference in income and the other variables and run the regression on these differences. Alternatively, I also wanted to introduce town fixed effects, so that anything that is constant within a town is controlled for. In Stata, I ran encode town, g(town_number) followed by regress income x y z i.town_number – user584534 Sep 04 '23 at 11:30
  • Duplicate of https://stats.stackexchange.com/questions/444041/negative-adjusted-r2-in-twoway-effects-within-model ? – Christoph Hanck Sep 04 '23 at 12:30
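
The point made in the comments, that the two models have different responses, can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the question: the pairs are disjoint (each town appears in one pair), there is a single regressor, and the fixed effects are at the pair level rather than the town level. All names and parameter values (`simulate_pairs`, `compare_r2`, the standard deviations) are hypothetical. With two towns per group, the differenced regression without a constant and the dummy-variable regression give the same slope, but the dummy-variable R-squared is measured against the total variation of income in levels, most of which the dummies absorb, while the difference R-squared is measured against the variation of the differences only.

```python
import random


def simulate_pairs(n_pairs=200, beta=1.0, seed=0):
    """Simulate disjoint pairs of towns sharing an unobserved pair effect.

    Assumed (hypothetical) data-generating process:
        income = pair_effect + beta * x + noise
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        pair_effect = rng.gauss(0.0, 5.0)  # shared unobservable, large variance
        towns = []
        for _ in range(2):
            x = rng.gauss(0.0, 1.0)
            income = pair_effect + beta * x + rng.gauss(0.0, 1.0)
            towns.append((x, income))
        pairs.append(towns)
    return pairs


def compare_r2(pairs):
    g = len(pairs)   # number of pairs
    n = 2 * g        # number of town-level observations

    # Model 1: regress the income difference on the x difference, no constant.
    dx = [a[0] - b[0] for a, b in pairs]
    dy = [a[1] - b[1] for a, b in pairs]
    slope_diff = sum(u * v for u, v in zip(dx, dy)) / sum(u * u for u in dx)
    ssr_diff = sum((v - slope_diff * u) ** 2 for u, v in zip(dx, dy))
    sst_diff = sum(v * v for v in dy)  # uncentered, since there is no constant
    r2_diff = 1.0 - ssr_diff / sst_diff
    adj_r2_diff = 1.0 - (1.0 - r2_diff) * g / (g - 1)

    # Model 2: income in levels on x plus one dummy per pair.
    # The slope and residuals equal those of the within-pair demeaned regression.
    xw, yw = [], []
    for a, b in pairs:
        mean_x = (a[0] + b[0]) / 2.0
        mean_y = (a[1] + b[1]) / 2.0
        for x, y in (a, b):
            xw.append(x - mean_x)
            yw.append(y - mean_y)
    slope_fe = sum(u * v for u, v in zip(xw, yw)) / sum(u * u for u in xw)
    ssr_fe = sum((v - slope_fe * u) ** 2 for u, v in zip(xw, yw))

    # The total sum of squares is taken over income in levels, not differences.
    incomes = [town[1] for pair in pairs for town in pair]
    mean_income = sum(incomes) / n
    sst_levels = sum((y - mean_income) ** 2 for y in incomes)
    r2_fe = 1.0 - ssr_fe / sst_levels
    # g dummies plus one slope are estimated, leaving n - g - 1 residual df.
    adj_r2_fe = 1.0 - (1.0 - r2_fe) * (n - 1) / (n - g - 1)

    return {
        "slope_diff": slope_diff, "slope_fe": slope_fe,
        "r2_diff": r2_diff, "adj_r2_diff": adj_r2_diff,
        "r2_fe": r2_fe, "adj_r2_fe": adj_r2_fe,
    }


if __name__ == "__main__":
    for name, value in compare_r2(simulate_pairs()).items():
        print(f"{name}: {value:.4f}")
```

In this sketch the slope estimates coincide, so the gap in R-squared and adjusted R-squared comes entirely from the denominator: the variance of income in levels versus the variance of the differences. The real data in the question (overlapping pairs, town-level dummies, several regressors) need not behave identically.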

0 Answers