
I have estimated, by Monte Carlo (MC) simulation, some probability values y, each of which depends on a value of x between 0 and 1.

Say, for instance, that the vector x contains

$x_{1} = 0.1,\ \ x_2 = 0.2,\ \ x_3 = 0.3,\ \ x_4=0.4,\ \ x_5 = 0.5,\ \ x_6 = 0.6,\ \ x_7 = 0.7$

and the vector of estimated mean values y contains

$y_{1}(x_1) = 0.340,\ \ y_2(x_2) = 0.329,\ \ y_3(x_3) = 0.322,\ \ y_4(x_4)=0.299,\ \ y_5(x_5) = 0.278,\ \ y_6(x_6) = 0.255,\ \ y_7(x_7) = 0.237.$

I also have an estimated standard deviation value for each of the estimated values in y. Let these uncertainty values be contained in a vector s.

Now I estimate $y(1)$ by extrapolation (curve fitting). I do this by fitting my values to a curve of the form

$y(x) = q \exp \left\lbrace -a\left(x-b\right)^c \right\rbrace$

by minimizing a weighted mean square error function (with weights according to s) in order to find the optimal parameters $q, a, b, c$.
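For concreteness, here is a minimal sketch of that fit in Matlab (the sample size N, the starting guess p0, and plain `fminsearch` are simplifications for illustration, not exactly what I run):

```matlab
% Minimal sketch of the weighted fit described above (illustrative only).
x = [0.1 0.2 0.3 0.4 0.5 0.6 0.7];
y = [0.340 0.329 0.322 0.299 0.278 0.255 0.237];
N = 1e4;                               % assumed MC sample size
s = sqrt((1 - y) ./ (N * y));          % per-point standard deviations

model = @(p, x) p(1) * exp(-p(2) * (x - p(3)).^p(4));   % p = [q a b c]
wmse  = @(p) sum(((y - model(p, x)) ./ s).^2);          % weighted MSE

p0   = [0.35 1 0 1];                   % rough start; keep b < min(x) so
                                       % (x - b)^c stays real for non-integer c
pOpt = fminsearch(wmse, p0);           % minimize the weighted MSE
y1   = model(pOpt, 1);                 % extrapolated estimate y(1)
```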

This gives me my estimate $y(1)$. My question, however, is how to measure the uncertainty in this estimate. Is there, for instance, a good way to exploit the values in s?

So far, I have thought of this: the values in s were actually estimated by

$s_i(x_i) = \sqrt{\frac{1-y_i(x_i)}{Ny_i(x_i)}}$

where $N$ was the sample size in the MC simulation of $y_i(x_i)$. Then I thought I could perhaps simply use

$s(1) = \sqrt{\frac{1-y(1)}{Ny(1)}}$

as a measure of the uncertainty in $y(1)$, even though I never actually estimated $y(1)$ using $N$ samples in that way. Also, it's a drawback that this value doesn't depend on the size of y.

Any suggestions? It would be great to obtain some sort of standard deviation value $s(1)$, so that I can easily find the relative error $RE = s(1)/y(1)$ as well. (I have some other methods for estimating the probability $y(1)$, and it would be nice to compare the $s(1)$ and/or $RE$ of the various methods...)
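In code, the plug-in idea and the relative error would simply be (reusing y1 and N from the sketch above):

```matlab
% Plug-in uncertainty for the extrapolated estimate (my tentative idea).
s1 = sqrt((1 - y1) / (N * y1));   % proposed standard deviation at x = 1
RE = s1 / y1;                     % relative error
```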

Thank you! :-)

  • Have you noticed that this model for $y(x)$ is not a good fit to the values you give? – whuber Jun 05 '12 at 17:38
  • Yes, I guess I could have given a better example. I actually have (about) 40 $y$-estimates corresponding to $x$-values in the interval $x_1 = 0.2$ to $x_{40} = 0.6$, and those I gave are the first seven of them. – moonlight Jun 05 '12 at 18:27
  • That suggests you be extremely cautious about extrapolating so far using an exponential model that doesn't fit the data! Do you have any theoretical basis for supposing that this particular functional form holds throughout the interval from $0.2$ all the way to $1.0$? – whuber Jun 05 '12 at 19:12
  • The value $y(1)$ is in my case often a rare-event probability, meaning it is very small. I've read an article that applies that model to various reliability problems very similar to mine. Ideally I should have $y_i(x_i)$ for $x_i$ closer to 1, but those values are so small that they are very hard to estimate by plain MC. On the plots I've seen so far, the model seems to fit the data. What is so dangerous about an exponential model? Btw, thank you for answering/helping me!:):) – moonlight Jun 05 '12 at 19:15
  • And I know this method might not be 100% trustworthy; that's exactly why a measure of the error would be great to have:) – moonlight Jun 05 '12 at 19:21
  • This is the article: http://www.sciencedirect.com/science/article/pii/S0167473009000186 – moonlight Jun 05 '12 at 19:41
  • That explains why the fit is so poor: you have not specified the same model! You need to replace $x$ with $\log(x)$. Moreover, the fitting really should be done for $\log(y)$ vs $\log(x)$ (which has a different error structure). – whuber Jun 05 '12 at 20:57
  • Hm? There is no $\log(\lambda)$ ($=\log(x)$) in equation (10) on the second page of that article? I have minimized expression (11) using standard optimization methods in Matlab, and there is no $\log(\lambda)$ there either. (They take the log of both sides, then the difference, then square it, multiply by a weight, and sum; a sketch of that objective appears after the comments below. Is that a reasonable way to find the optimal parameters?) My model is still given by (10), is it not?

    Do you have a suggestion on how to measure the error at $x = 1$? If I am able to do this, it should reflect a poor (untrustworthy) fit anyway, should it not? :-)

    – moonlight Jun 06 '12 at 08:11
  • My two cents: I am afraid that you cannot base extrapolation uncertainty on a poorly fitting model. Calibration uncertainty assumes that the model is statistically valid (e.g. good $\chi^2$ and residual statistics). To estimate a prediction uncertainty in this case, you would have to choose alternative models and estimate how your set of plausible models deviates at $x=1$... Hope this helps. – Pascal Dec 13 '16 at 09:04
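For reference, here is the log-scale objective discussed in the comments, as I understand expression (11) of the article; the weights w are an assumption on my part (e.g. inverse delta-method variances of $\log y_i$), and model, x, y, s, p0 are as in the sketch above:

```matlab
% Log-scale weighted objective, as I read expression (11): take logs of
% both sides, difference them, square, weight, and sum.
w       = (y ./ s).^2;   % assumed weights: 1/Var(log y_i) via the delta method
wmseLog = @(p) sum(w .* (log(y) - log(model(p, x))).^2);
pOptLog = fminsearch(wmseLog, p0);
```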

0 Answers