I am reading this paper and need to replicate what they did in Table 4:

However, I am having trouble understanding what local and global polynomial regressions are.

Could someone please explain? If you need more context, see page 8 of the linked paper, right above Figure 5: "We do this exercise for the six global and the two local polynomial regressions"

  • I assume global means fitted to all the data points and local means fitted on either side of the breakpoint, but the paper is too long and detailed for me to read right now. – mdewey Apr 08 '23 at 12:50

1 Answer

A global polynomial regression fits a single polynomial to the entire data set. This leads to many problems, explained on this page and, in more technical detail, on this page. The question on the latter page cites the same Gelman and Imbens paper that you do. Frank Harrell's answer is a brief, simple summary of the problems. A major problem with a global regression is that any single point can have a large influence on the fit far away from it. Unless you know that you have the correct form for the polynomial, that influence can badly distort the fit.
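To make the point about far-away influence concrete, here is a minimal sketch on made-up data (the sine curve, the degree-5 polynomial, the bandwidth of 0.3, and the helper name `fitted_value` are all assumptions chosen for illustration, not anything taken from the paper). It moves a single observation at the left edge and checks how much the fitted value at the right edge changes under a global fit versus a fit restricted to nearby points.

```python
import numpy as np

def fitted_value(x, y, x0, degree, bandwidth=None):
    """Fit a polynomial by least squares and evaluate it at x0.

    With bandwidth=None every point is used (global fit); otherwise only
    points with |x - x0| <= bandwidth are used (local fit).
    """
    if bandwidth is not None:
        keep = np.abs(x - x0) <= bandwidth
        x, y = x[keep], y[keep]
    coefs = np.polyfit(x, y, degree)
    return np.polyval(coefs, x0)

# Illustrative data only: a smooth curve on an evenly spaced grid.
x = np.linspace(-1.0, 1.0, 41)
y = np.sin(3.0 * x)

# Perturb a single observation at the far left (x = -1).
y_perturbed = y.copy()
y_perturbed[0] += 5.0

# Compare fitted values at the far right (x = 1).
x0 = 1.0
global_shift = (fitted_value(x, y_perturbed, x0, degree=5)
                - fitted_value(x, y, x0, degree=5))
local_shift = (fitted_value(x, y_perturbed, x0, degree=1, bandwidth=0.3)
               - fitted_value(x, y, x0, degree=1, bandwidth=0.3))

print("change in global fit at x = 1:", global_shift)
print("change in local fit at x = 1: ", local_shift)
```

The global fit at x = 1 moves even though the changed point sits at the opposite end of the data, while the local fit does not move at all because that point is outside its window.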

A local regression instead fits separate models to a series of restricted, local ranges of the data. That way, points don't affect the behavior of the curve far from their own locations. This is combined with a mechanism to connect the local fits, often with some constraint on the smoothness of the connections. In the context of regression discontinuity discussed by Gelman and Imbens, the most important range of the data is that close to the threshold for discontinuity. Thus the specific situation they cover is when

researchers discard the units with $x_i$ more than some bandwidth $h$ away from the threshold and estimate a linear or quadratic function on the remaining units...
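The quoted procedure can be sketched roughly as follows. This is only an illustration under assumptions: the data are simulated, the function name `rd_local_estimate` is hypothetical, and fitting a separate polynomial on each side of the threshold and differencing the two intercepts is the usual regression discontinuity convention rather than something spelled out in the quote.

```python
import numpy as np

def rd_local_estimate(x, y, threshold, h, degree=1):
    """Estimate the jump at the threshold using only units with |x - threshold| <= h."""
    xc = x - threshold                      # center so the intercept is the value at the threshold
    inside = np.abs(xc) <= h                # discard units farther than the bandwidth h
    left = inside & (xc < 0)
    right = inside & (xc >= 0)
    # np.polyfit returns coefficients from highest power to lowest,
    # so the last coefficient is the fitted value at xc = 0.
    left_intercept = np.polyfit(xc[left], y[left], degree)[-1]
    right_intercept = np.polyfit(xc[right], y[right], degree)[-1]
    return right_intercept - left_intercept

# Simulated data (assumption): a true jump of 2.0 at x = 0 plus mild curvature and noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 400)
y = 1.0 + 0.5 * x + 0.8 * x**3 + 2.0 * (x >= 0) + rng.normal(0.0, 0.1, x.size)

linear_jump = rd_local_estimate(x, y, threshold=0.0, h=0.25, degree=1)
quadratic_jump = rd_local_estimate(x, y, threshold=0.0, h=0.25, degree=2)
print("local linear estimate of the jump:   ", linear_jump)
print("local quadratic estimate of the jump:", quadratic_jump)
```

Setting `h` large enough to include every unit and raising `degree` turns the same function into a global polynomial fit, which is the contrast the paper draws.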

More generally, beyond regression discontinuity studies, there are several ways to do local polynomial (including linear) regressions. These include weighted regressions like loess (where the weights versus distance aren't necessarily the all-or-none type mentioned in the above quote) and several types of splines, whose differences are outlined here.
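As a rough illustration of weights that decay with distance instead of being all-or-none, here is a simplified loess-style smoother. It is a sketch under assumptions, not the full loess algorithm: it uses a fixed bandwidth `h` rather than a nearest-neighbor span, skips the robustness iterations, and the names `tricube` and `local_linear_smooth` are made up for this example.

```python
import numpy as np

def tricube(u):
    """Tricube kernel: weight 1 at distance 0, falling smoothly to 0 at |u| >= 1."""
    u = np.clip(np.abs(u), 0.0, 1.0)
    return (1.0 - u**3) ** 3

def local_linear_smooth(x, y, grid, h):
    """At each grid point, fit a kernel-weighted straight line and keep its value there."""
    fitted = np.empty(len(grid))
    for i, x0 in enumerate(grid):
        k = tricube((x - x0) / h)
        use = k > 0
        # np.polyfit multiplies residuals by w before squaring,
        # so pass the square root of the kernel weights.
        slope, intercept = np.polyfit(x[use] - x0, y[use], 1, w=np.sqrt(k[use]))
        fitted[i] = intercept               # x is centered at x0, so this is the fit at x0
    return fitted

# Illustrative data only.
x = np.linspace(0.0, 1.0, 101)
y = np.sin(2.0 * np.pi * x)
grid = np.linspace(0.0, 1.0, 11)
smooth = local_linear_smooth(x, y, grid, h=0.2)
print(np.round(smooth, 3))
```

Replacing `tricube` with an indicator for `|u| <= 1` would recover the all-or-none weighting described in the quote.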

EdM