3

I am using the plm() function in R with a fixed effects model:

fixed <- plm(Y ~ X, data = pdata, index = c("sireID", "cropNum"), model = "within")

The output is then:

> summary(fixed)
Oneway (individual) effect Within Model

Call: plm(formula = Y ~ X, data = pdata, model = "within")

Unbalanced Panel: n = 256, T = 1-16, N = 1039

Residuals: Min. 1st Qu. Median 3rd Qu. Max. -12.2822 -2.1441 0.0000 1.8906 32.6836

Coefficients: Estimate Std. Error t-value Pr(>|t|)
X1 0.643017 0.042771 15.0340 <2e-16 *** X2 1.945723 1.745879 1.1145 0.2654


Signif. codes: 0 ‘*’ 0.001 ‘’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Total Sum of Squares: 17203 Residual Sum of Squares: 13330 R-Squared: 0.2251 Adj. R-Squared: -0.029887 F-statistic: 113.439 on 2 and 781 DF, p-value: < 2.22e-16

I want to obtain the intercept for this fixed effects model. Is this possible to do using plm in R? I tried using the solution given here by adding effect = "twoways" to the plm() function but this didn't produce an intercept term.

Thomas Bilach
  • 5,999
  • 2
  • 11
  • 33
  • 2
    Well, read the post you linked till the end and look at the linked within_intercept function of package plm (and the literature cited for the theoretical background). – Helix123 Aug 03 '21 at 09:27
  • 1
    @Helix123 You're correct. I updated my answer in case the OP is looking for this. The overall intercept is just a weighted mean of all the state effects. I'm not actually sure why this is useful though. Any thoughts? – Thomas Bilach Aug 03 '21 at 22:42

1 Answers1

4

The fixed effects model your estimating is akin to estimating a separate intercept for each sireID. The unit-specific intercepts don't appear in your summary output, but you can ask for them with the following function: fixef(fixed).

Estimating fixed effects along the other dimension (e.g., effect = "twoways) isn't going to return a global intercept. The main takeaway from the post you referenced is that the overall intercept $\alpha$ is perfectly collinear with $\alpha_i$, as the sum of all the unit-specific effects (i.e., $\alpha_i$'s) is one.

Estimating your equation using lm() and including dummies for all units will return an intercept, but the intercept is simply the effect for the omitted dummy. To see this in practice, I will use the Produc dataset (U.S. state production from 1970 to 1986) from the plm package.

# Load the plm package and the Produc dataset

library(plm) data("Produc", package = "plm")

Estimate a one-way fixed effects model

Extract the state-specific intercepts

mod_within <- plm(hwy ~ gsp, index = c("state", "year"), model = "within", effect = "individual", data = Produc) summary(fixef(mod_within))

The fixef() function extracts the fixed effects from a plm object. The output below reports the unique state-specific intercepts. Remember the effect for Alabama in the first row.

# Summary output from the fixef() function
# These are the state-specific intercepts...
         Estimate Std. Error t-value  Pr(&gt;|t|)    

ALABAMA 7390.71 185.97 39.742 < 2.2e-16 *** ARIZONA 4749.54 180.96 26.246 < 2.2e-16 *** ARKANSAS 3761.83 175.90 21.387 < 2.2e-16 *** CALIFORNIA 37091.91 697.28 53.195 < 2.2e-16 *** COLORADO 4501.00 186.95 24.076 < 2.2e-16 *** CONNECTICUT 6142.66 192.52 31.907 < 2.2e-16 *** DELAWARE 1897.09 171.51 11.061 < 2.2e-16 *** FLORIDA 13175.73 266.92 49.362 < 2.2e-16 *** GEORGIA 8331.90 208.42 39.977 < 2.2e-16 *** IDAHO 2369.09 171.82 13.789 < 2.2e-16 *** ILLINOIS 22536.50 360.88 62.449 < 2.2e-16 *** INDIANA 8559.26 213.81 40.033 < 2.2e-16 *** IOWA 8951.42 184.49 48.520 < 2.2e-16 *** KANSAS 5983.13 181.48 32.969 < 2.2e-16 *** KENTUCKY 9925.08 187.95 52.808 < 2.2e-16 *** LOUISIANA 9972.78 217.73 45.804 < 2.2e-16 *** MAINE 2205.62 172.15 12.812 < 2.2e-16 *** MARYLAND 8259.07 198.04 41.704 < 2.2e-16 *** MASSACHUSETTS 7203.77 225.36 31.965 < 2.2e-16 *** MICHIGAN 16473.75 288.12 57.178 < 2.2e-16 *** MINNESOTA 10479.91 199.22 52.604 < 2.2e-16 *** MISSISSIPPI 5435.86 176.62 30.777 < 2.2e-16 *** MISSOURI 9406.42 206.89 45.465 < 2.2e-16 *** MONTANA 3644.11 171.82 21.208 < 2.2e-16 *** NEBRASKA 4359.62 175.03 24.907 < 2.2e-16 *** NEVADA 1969.22 172.23 11.434 < 2.2e-16 *** NEW_HAMPSHIRE 1965.13 171.93 11.430 < 2.2e-16 *** NEW_JERSEY 10672.72 262.80 40.612 < 2.2e-16 *** NEW_MEXICO 3158.47 174.12 18.140 < 2.2e-16 *** NEW_YORK 29833.23 529.86 56.304 < 2.2e-16 *** NORTH_CAROLINA 7987.01 212.46 37.593 < 2.2e-16 *** NORTH_DAKOTA 2746.73 171.58 16.008 < 2.2e-16 *** OHIO 21086.53 317.17 66.484 < 2.2e-16 *** OKLAHOMA 5029.61 188.38 26.700 < 2.2e-16 *** OREGON 5566.66 180.65 30.815 < 2.2e-16 *** PENNSYLVANIA 21292.22 325.81 65.351 < 2.2e-16 *** RHODE_ISLAND 1783.01 172.00 10.366 < 2.2e-16 *** SOUTH_CAROLINA 3806.90 179.89 21.162 < 2.2e-16 *** SOUTH_DAKOTA 3024.36 171.36 17.650 < 2.2e-16 *** TENNESSE 9005.20 195.23 46.125 < 2.2e-16 *** TEXAS 24581.81 464.77 52.890 < 2.2e-16 *** UTAH 3296.01 173.52 18.995 < 2.2e-16 *** VERMONT 1792.38 171.08 10.477 < 2.2e-16 *** VIRGINIA 11836.57 212.64 55.666 < 2.2e-16 *** WASHINGTON 9110.27 198.90 45.804 < 2.2e-16 *** WEST_VIRGINIA 6054.88 175.35 34.530 < 2.2e-16 *** WISCONSIN 9116.06 203.39 44.820 < 2.2e-16 *** WYOMING 2703.02 171.94 15.721 < 2.2e-16 ***

Estimating dummies for each state using lm() results in equivalent estimates. However, we must drop a state since all the state dummies sum to one. The state variable in the Produc dataset is a factor variable; it's ordered alphabetically. R drops the first factor by default, which is Alabama.

# The least squares dummy variable (LSDV) estimator
# Include a dummy variable for each unit (i.e., state)

mod_lsdv <- lm(hwy ~ gsp + as.factor(state), data = Produc)

Extract the intercept

mod_lsdv$coefficients['(Intercept)'] (Intercept) 7390.71

Note how the indicator for Alabama doesn't necessarily get dropped; it actually gets absorbed into the intercept. The omitted state effect could be any state in the panel.

In short, the intercepts usually aren't the focus in a fixed effects analysis; they're nuisance. In your case, the principal focus should be on X1 and X2.


In accordance with the suggestion provided by @Helix123, I wanted to note that there is, in fact, a within_intercept() function in the plm package. I always found it a bit artificial but it does return an "overall intercept" for within models and its accompanying standard error.

Under the hood, it's actually a weighted mean of the fixed effects. You could calculate this fairly easily in the one-way case using the following code:

# The overall intercept is just a weighted mean of the state effects

weighted.mean(fixef(mod_within), pdim(mod_within)$Tint$Ti)

Simply feed the weighted.mean() function the individual state effects, followed by a numeric vector of the weights, and you got your overall intercept.

Thomas Bilach
  • 5,999
  • 2
  • 11
  • 33
  • 1
    Have a look at plm's within_intercept and ?within_intercept (esp. the examples section) for the relation of fixef to the overall intercept of a fixed effect model. – Helix123 Aug 03 '21 at 09:37