1

Apologies if the question is too trivial, but what exactly sets extrapolation and prediction apart?

Let's say that I have a set of data for a hundred points (the independent variable may not be uniformly spaced) as:

{{1, 7}, {2, 8},...,{100, 5}}

Now, I can apply any of the extrapolation techniques (Newton's, Lagrange's, or even curve fitting, for that matter) and get y = f(x). If I then put in any x, inside or outside my original data set, I get the corresponding y. This way I have predicted a y value that wasn't originally in my data set.

How is Prediction different from this?

Hyperbola
  • 111

2 Answers

1

Extrapolation is the estimation of dependent values outside the range covered by the (independent) data the model has been fit to: https://en.wikipedia.org/wiki/Extrapolation. It's not the same as interpolation, which is estimation between the original data points. Prediction usually refers to future events, but in your context you could say (regarding the estimates) that prediction is a hypernym, i.e. an umbrella term, covering fitted values, interpolation, and extrapolation.
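
As a rough illustration of that terminology (a minimal sketch, not part of the original answer): the sample data, the straight-line model, and the helper names `fit_line`, `classify` and `predict` below are all assumptions made for the example. The same fitted function produces every estimate; only the position of x relative to the data decides what the estimate is called.

```python
# Hypothetical, non-uniformly spaced sample data (not taken from the question).
data = [(1.0, 7.0), (2.0, 8.0), (4.5, 9.5), (7.0, 12.0), (10.0, 15.5)]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def classify(x_new, points):
    """Name the kind of estimate made at x_new, relative to the observed x values."""
    xs = [x for x, _ in points]
    if x_new in xs:
        return "fitted value"      # x was part of the data the model was fit to
    if min(xs) <= x_new <= max(xs):
        return "interpolation"     # new x, but inside the observed range
    return "extrapolation"         # new x outside the observed range


def predict(x_new, points):
    """Return (estimated y, label) for x_new using the fitted line."""
    intercept, slope = fit_line(points)
    return intercept + slope * x_new, classify(x_new, points)


for x in (2.0, 3.0, 12.0):
    y_hat, label = predict(x, data)
    print(f"x = {x}: y_hat = {y_hat:.2f} ({label})")
```

The straight line is only a stand-in; any fitted f(x), such as a Newton or Lagrange polynomial, could be swapped in without changing the classification step.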

Roland
  • 6,611
  • 2
    I understand the difference between extra/interpolation but that's not my question. Can you elaborate more upon Extrapolation vs Prediction? Your current answer (last statement) isn't satisfactory enough. – Hyperbola Apr 04 '16 at 12:11
  • 1
    There is no "vs". Extrapolation is prediction outside of the range covered by data, interpolation is prediction inside this range. – Roland Apr 04 '16 at 12:14
  • So prediction is just a term for saying extrapolation & interpolation. In other words, every prediction is either an extrapolation or an interpolation? – Hyperbola Apr 04 '16 at 12:16
  • 1
    Basically. Although prediction refers to, well, prediction of future observations (which is important for estimation of uncertainties) whereas that's not necessarily the case for extrapolation and interpolation. – Roland Apr 04 '16 at 12:23
  • 1
    It is not clear to me that all predictive models extrapolate. E.g., time series analysis models based on well-informed and empirically backed data-generating processes, and dynamic empirical modeling methods, such as simplex projection, would seem to be interpolation-based, not extrapolation based. – Alexis Dec 13 '22 at 20:43
  • 2
    To add to what @Alexis said, in spatial statistics interpolation and extrapolation are both considered to be prediction. They are often distinguished by whether the support of the prediction (the x-values) lies within the convex hull of the data or not, respectively. In one dimension that distinction comes down to whether the prediction is made for an x value within the data range or beyond it (in either direction). – whuber Dec 13 '22 at 20:55
  • @whuber Yes, one can call just about any statistical procedure a prediction, and that is misleading because it is often irrelevant or even wrong. The mean value of uniformly random integers from 1 to 6 is 3.5, which is not closed for integers, so 3.5 is a wrong prediction. A least error prediction from OLS in y predicts an equation that is wrong for bivariate data, but it offers a least error of "predicting" y given x. More accurately, it is an observed least error mapping of x to y. One can call any probability a "prediction," but more accurately it is a post hoc observation. – Carl Dec 14 '22 at 15:21
  • @Carl You misread my comments and misuse the terms. I have explicitly distinguished estimation from prediction. Guessing the value of a random variable is not merely a matter of estimating a probability related to its distribution. For details, please read our thread on this topic. – whuber Dec 14 '22 at 17:46
  • @whuber In your answer, that you refer to above, you state, "a predictor usually has larger uncertainty than a related estimator" I do not agree. OLS in y has less error of prediction of y given x and a higher R$^2$ than a bivariate estimating equation, which has better accuracy if the data is indeed bivariate. A bivariate estimating equation is not a least error predictor, the thing "predicted" is the bivariate equation, and is useful, for example, in clinical laboratories to replace test A with test B. It seems to me you are misusing terms by lumping different things into the same terms. – Carl Dec 14 '22 at 23:43
  • @Carl I invite you to look up these terms in any reputable textbook. – whuber Dec 15 '22 at 14:47
  • @whuber OK, This is more or less what I am using: From https://stats.oecd.org/glossary/detail.asp?ID=3792 "Definition: In general, prediction is the process of determining the magnitude of statistical variates at some future point of time. In statistical contexts the word may also occur in slightly different meanings; e.g. in a regression equation expressing a dependent variate y in terms of dependent x’s, the value given for y by specified values of x’s is called the “predicted” value even when no temporal element is involved." What are you using? – Carl Dec 15 '22 at 23:07
  • @Carl I avoid the econometric literature like that link, which employs some terminology narrowly (everything seems to revolve around time series), and rely on classic statistics texts like Hahn & Meeker, Statistical Intervals section 1.2, Wasserman's All of Statistics (especially exercise 13.10), or Kendall's Advanced Theory of Statistics, Volume 2, Fifth Edition, sections 28.9 - 28.12. I first learned of the distinctions between estimation and prediction in the spatial statistics literature, where that concept is fundamental and the "future" isn't around to confuse people. – whuber Dec 15 '22 at 23:23
  • @whuber I think you have a point but I feel like I am shadow-boxing. This is really another question even more nitty-gritty than what the OP asked here, and this text is not really helping anyone. Therefore, let us move this to a discussion on its own merit. https://stats.stackexchange.com/questions/599212/what-exactly-does-prediction-mean-in-best-statistical-terms-and-when-is-it-inapp – Carl Dec 15 '22 at 23:55
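
The convex-hull criterion from whuber's comment in the thread above can be sketched for two-dimensional locations roughly as follows. This is an illustrative sketch only: the function names and sample sites are hypothetical, and it assumes at least three sites that are not all collinear.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]


def inside_hull(q, hull):
    """True if q lies inside or on the boundary of a counter-clockwise convex polygon.

    Assumes the hull has at least three vertices (sites not all collinear).
    """
    n = len(hull)
    return all(cross(hull[i], hull[(i + 1) % n], q) >= 0 for i in range(n))


def kind_of_prediction(q, sites):
    """Label a prediction location q relative to the observed sites."""
    hull = convex_hull(sites)
    return "interpolation" if inside_hull(q, hull) else "extrapolation"


# Hypothetical observation sites and two prediction locations.
sites = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1)]
print(kind_of_prediction((1, 1), sites))
print(kind_of_prediction((5, 1), sites))
```

In one dimension the hull collapses to the interval between the smallest and largest observed x, which is the range check described in the same comment.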
-2

The explanation is excellent; however, the casual association of the term "prediction" with future events may have been too subtle. In other words, I am not sure the implied other half of that explanation came across: prediction = future event; estimation = not necessarily future, i.e. a fitted value, which can be inferential (about an established population).