smoother: "... prediction on x, which is unrelated to the values of x_i." What does it mean?

Question

My lecture notes say:

Any practical implementation of a smoother is based on input in the form of a scatterplot $(x_i, y_i)_{i=1}^n$, on a tuning parameter $h$, and on a grid of output points $x$ where one would like to see the estimate (usually chosen as a dense regular grid on the domain of interest, and unrelated to the values of the $x_i$).

What does it mean to say that $x$ is unrelated to the values of the $x_i$?

score 1 · Accepted Answer · answered Sep 07 '20 at 07:49

Conceptually, a smoother is a function from $x$ to $y$, which is defined at every $x$ value (or at least, every $x$ value close enough to a data point). There are infinitely many possible $x$ values, so evaluating it at all of them is not practical.

If you want the smoother in order to draw it on the scatterplot (or draw it instead of the scatterplot), you need it for at most as many $x$ values as you have pixels, and you can often make do with many fewer (maybe 100 or so), with linear interpolation done by the graph drawing process. The number of points you need and their spacing don't really depend on what data you were given, so in that sense the grid is unrelated to the values of the $x_i$.

More specifically:

you don't need to predict at every $x_i$, especially if there are a lot of them
you may need to predict at some values $x$ that are not among the $x_i$, especially if there are gaps in the distribution of the $x_i$.

It is common for smoothers to take the vectors $\langle x_i, y_i\rangle$ as inputs, along with another vector, called something like xout, giving the points where the smoother is to be evaluated. It is less universal, but not uncommon, for the default xout to be a uniformly spaced grid.
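To make the xout idea concrete, here is a minimal sketch in Python, assuming a Nadaraya-Watson smoother with a Gaussian kernel. The function name kernel_smooth, the n_grid argument and the choice of kernel are illustrative assumptions, not the interface of any particular library. The point is only that the evaluation grid is built from the range of the data and a chosen number of points, not from the individual $x_i$.

```python
import numpy as np

def kernel_smooth(x, y, h, xout=None, n_grid=100):
    """Nadaraya-Watson smoother with a Gaussian kernel of bandwidth h.

    xout holds the points where the estimate is evaluated. If it is
    omitted, a uniformly spaced grid over the range of x is used
    (an assumed default, chosen here for illustration).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if xout is None:
        xout = np.linspace(x.min(), x.max(), n_grid)
    xout = np.asarray(xout, dtype=float)
    # weights[j, i] = K((xout_j - x_i) / h), one row per output point
    u = (xout[:, None] - x[None, :]) / h
    weights = np.exp(-0.5 * u**2)
    # Weighted average of the y_i at each output point
    yhat = weights @ y / weights.sum(axis=1)
    return xout, yhat

rng = np.random.default_rng(0)
# Irregularly spaced x_i with a gap between 0.4 and 0.6
x = np.concatenate([rng.uniform(0.0, 0.4, 150), rng.uniform(0.6, 1.0, 150)])
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# 300 data points in, 100 evenly spaced predictions out
grid, fit = kernel_smooth(x, y, h=0.05)
```

Here the grid has 100 points although there are 300 observations, and it includes points inside the gap where no $x_i$ was observed. Passing xout=x instead would give predictions at the data points, which is also a legitimate choice.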

As a secondary note, I don't think the claim is strictly true. First, there are settings where an analytic formula for the smoother is a feasible output (eg, penalised splines) and where that is valuable to simplify taking derivatives and integrals. Second, it's not hard to find implementations where the default behaviour is to predict at the input values $x_i$, eg

I understand that we usually need xout for the smoother to be drawn on a graph or to make predictions. What I am still confused about is why these xout should be "not related to the $x_i$'s", as my lecture notes say. My understanding is that even if $x$ is to some extent related to the $x_i$'s, we can still draw reasonable smoothing curves. — WCMC, Sep 07 '20 at 15:57
It's not that it has to be unrelated, it's that it can be unrelated. — Thomas Lumley, Sep 07 '20 at 21:12
