
Suppose I have data points of the form (x, y) and I know that the CV should be constant across all x values (i.e. the data are heteroskedastic). Given that, how can I estimate the CV if I don't have multiple y values for each x?

Typically, the CV is computed on replicated data (e.g. by repeatedly measuring y for some given x). But in this case, not all x values have multiple y values. Note that a constant CV (not a constant variance) can be assumed throughout the domain of x.

If anyone could direct me to literature that handles scenarios like this, that would be great.

CodeGuy
  • 473

1 Answer

You might fit a robust line (or one of some form that's otherwise heteroskedasticity-consistent) and then regress the squared residuals on $x$ in order to model the variance. The fitted variance function could then be used to update the weights.

(Another way to get starting values would be to consider a transformation to approximately constant variance.)
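As a rough sketch of how that loop could look when the variance model is simply "standard deviation proportional to the mean", here is a small iteratively reweighted fit in plain Python. Everything named here (`wls_line`, `fit_constant_cv`, the straight-line mean, the simulated data) is my own assumption for illustration, not something from the question; it starts from an unweighted fit rather than a robust one, and it plugs in weights $1/\hat\mu^2$ directly instead of estimating a variance function from the squared residuals.

```python
import math
import random


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def fit_constant_cv(x, y, max_iter=50, tol=1e-10):
    """Iteratively reweighted fit assuming sd(y | x) = cv * mean(y | x)."""
    a, b = wls_line(x, y, [1.0] * len(x))  # unweighted starting values
    for _ in range(max_iter):
        mu = [a + b * xi for xi in x]
        if min(mu) <= 0:
            raise ValueError("fitted means must be positive for this sketch")
        # Variance proportional to mean^2, so weights are 1 / mean^2
        a_new, b_new = wls_line(x, y, [1.0 / m ** 2 for m in mu])
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    mu = [a + b * xi for xi in x]
    # Under the assumed model the relative residuals have sd equal to the CV
    rel = [(yi - m) / m for yi, m in zip(y, mu)]
    cv = math.sqrt(sum(r * r for r in rel) / (len(x) - 2))
    return a, b, cv


# Simulated data (hypothetical): mean 2 + 3x, true CV 0.10, one y per x
rng = random.Random(1)
x = [1 + 9 * i / 199 for i in range(200)]
y = [rng.gauss(2 + 3 * xi, 0.10 * (2 + 3 * xi)) for xi in x]

a, b, cv = fit_constant_cv(x, y)
print(round(a, 2), round(b, 2), round(cv, 3))
```

Stopping after the first reweighting gives the two-step version; the CV estimate is just the root mean square of the relative residuals, with a degrees-of-freedom correction for the two mean parameters.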

Another approach, possibly simpler, would be to use GLMs (say a Gaussian family with a $\mu^2$ variance function).

Edit: yet another approach: divide through by $\sqrt{x}$. Let $y^* = y/\sqrt{x}$ and $x^* = x/\sqrt{x}$ (equivalently, $x^* = \sqrt{x}$), so that the constant predictor becomes $x_0^* = 1/\sqrt{x}$. Regress $y^*$ on $x_0^*$ and $x^*$ with no intercept (because $x_0^*$ represents it). (End of edit.)
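To make the transformation concrete, the sketch below checks numerically that the no-intercept regression on the transformed variables gives the same coefficients as weighted least squares with weights $1/x$, which is the weighting that dividing by $\sqrt{x}$ implies (variance taken proportional to $x$). The helper names and the simulated data are assumptions of mine. If you instead take the standard deviation to be proportional to $x$, the same construction with division by $x$ corresponds to weights $1/x^2$.

```python
import math
import random


def no_intercept_fit(u, v, z):
    """Least squares fit of z = a*u + b*v (no intercept); returns (a, b)."""
    s11 = sum(ui * ui for ui in u)
    s12 = sum(ui * vi for ui, vi in zip(u, v))
    s22 = sum(vi * vi for vi in v)
    t1 = sum(ui * zi for ui, zi in zip(u, z))
    t2 = sum(vi * zi for vi, zi in zip(v, z))
    det = s11 * s22 - s12 * s12
    # Solve the 2x2 normal equations by Cramer's rule
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


# Simulated data (hypothetical): mean 2 + 3x, sd proportional to sqrt(x)
rng = random.Random(3)
x = [1 + 9 * i / 99 for i in range(100)]
y = [rng.gauss(2 + 3 * xi, 0.5 * math.sqrt(xi)) for xi in x]

# Transformed variables: x0* = 1/sqrt(x), x* = sqrt(x), y* = y/sqrt(x)
x0_star = [1 / math.sqrt(xi) for xi in x]
x_star = [math.sqrt(xi) for xi in x]
y_star = [yi / math.sqrt(xi) for xi, yi in zip(x, y)]
a_t, b_t = no_intercept_fit(x0_star, x_star, y_star)

# Direct weighted fit on the original scale with weights 1/x
a_w, b_w = wls_line(x, y, [1 / xi for xi in x])

print(a_t, b_t)
print(a_w, b_w)
```

The coefficient on $x_0^*$ is the intercept of the original line and the coefficient on $x^*$ is its slope, so the two printed pairs should agree up to rounding error.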

Finally, one could write the weighted least squares criterion as a function of the mean parameters which also appear in the variance function and calculate the parameters as the solution to a more general optimization problem.
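As a sketch of what such a criterion might look like, assume a straight-line mean $\mu_i(\beta) = \beta_0 + \beta_1 x_i$ and a standard deviation of $\tau\,\mu_i(\beta)$, where $\tau$ is the CV (the symbols $\beta$ and $\tau$ are my notation, not the question's). The weighted least squares criterion is then

$$Q(\beta) = \sum_{i=1}^{n} \frac{\bigl(y_i - \mu_i(\beta)\bigr)^2}{\mu_i(\beta)^2}.$$

One candidate for the more general problem, if you are additionally willing to assume Gaussian errors, is to minimize the negative log-likelihood

$$-\ell(\beta, \tau) = \sum_{i=1}^{n} \left[ \log\bigl(\tau\,\mu_i(\beta)\bigr) + \frac{\bigl(y_i - \mu_i(\beta)\bigr)^2}{2\,\tau^2\,\mu_i(\beta)^2} \right] + \frac{n}{2}\log(2\pi),$$

which for fixed $\beta$ is minimized at $\hat\tau^2 = Q(\beta)/n$, so the CV estimate drops out of the same fit.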

Some possible references for what you were suggesting:

1) The basic iterative (re-)weighted least squares algorithm (IWLS or IRLS), used to fit all manner of things from nonlinear regression to GLMs. It applies to your situation as a special case. If you don't iterate to convergence, you just refer to it as a two-step estimator (estimate model, get residuals, estimate variance, re-estimate model).

2) You might be able to use the approach in White's heteroskedasticity-consistent estimation (which has a kind of connection to (3) below) as an argument for what you want to do.

3) Here's an approach relating to estimating the variance function, using a method I've seen many times but only today spotted a real reference for:

The idea is that you fit some model, take logs of the squared residuals (since squared residuals approximate the variance), and fit a function to those. The fitted values let you back out relative variances* and hence relative weights (by inverting the relative variances) for the original regression.

This might count as sufficient for your case, which is a special case of this.

Wasserman, Larry (2006). All of Nonparametric Statistics. Berlin: Springer-Verlag. (See pp. 87-88.)

*(though if I recall correctly, I think in the linear case Geoff Eagleson established that this is biased for the intercept term so doesn't give a good idea of absolute variances.)
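A minimal sketch of the log-squared-residual idea in (3), under assumptions of my own: a straight-line mean, a log-linear variance model in the fitted mean, and simulated data with a true CV of 0.10. Under a constant CV, $\log \text{variance} = 2\log(\text{CV}) + 2\log\mu$, so the slope of the second regression should come out somewhere near 2. The intercept is not used for anything here: for Gaussian errors the mean of the log of a squared standardized residual is about $-1.27$ rather than 0, which shifts the intercept and is why only relative variances are read off.

```python
import math
import random


def ols_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


# Simulated data (hypothetical): mean 2 + 3x, true CV 0.10, one y per x
rng = random.Random(2)
x = [1 + 9 * i / 499 for i in range(500)]
y = [rng.gauss(2 + 3 * xi, 0.10 * (2 + 3 * xi)) for xi in x]

# Step 1: initial fit and residuals
a, b = ols_line(x, y)
fitted = [a + b * xi for xi in x]

# Step 2: regress log squared residuals on the log of the fitted mean
log_r2 = [math.log((yi - fi) ** 2) for yi, fi in zip(y, fitted)]
log_mu = [math.log(fi) for fi in fitted]
c0, c1 = ols_line(log_mu, log_r2)

# Step 3: relative variances and relative weights for a refit
rel_var = [math.exp(c0 + c1 * lm) for lm in log_mu]
weights = [1.0 / v for v in rel_var]

print("slope of log-variance model:", round(c1, 2))
```

The `weights` list can then be passed to any weighted least squares routine to refit the original line. The slope estimate is noisy because log squared residuals are very variable, so treat it as a rough check on the constant-CV assumption rather than a sharp test.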

--

Another possibility, though I don't have it to hand to double-check: I think this might be covered in Chapter 4 of Sheather's book, A Modern Approach to Regression with R.

Glen_b
  • 282,281
  • the variance is not constant, since CV is constant as x (the mean) increases... – CodeGuy Mar 24 '13 at 19:51
  • Can you explain which part of my answer suggests I thought your original variance was constant, rather than the coefficient of variation? – Glen_b Mar 24 '13 at 21:09
  • Sorry, I wasn't referencing part of your answer, I just wanted to clarify that! I was thinking to fit a regression line, compute errors, and find the SD of the error distribution. What do you think of that? – CodeGuy Mar 25 '13 at 01:39
  • You mean, along the lines of my first option (the first paragraph), but without starting with an estimate robust to heteroskedasticity, nor updating the line after initially estimating the variance-function? Are you asking why one would bother with those particular additions? – Glen_b Mar 25 '13 at 01:48
  • I suppose, yeah. How would I go about implementing your first answer? Has anyone done this in the literature before, such that you could direct me to an example? If I use this method (the first part of your answer), I'd like to cite it. – CodeGuy Mar 25 '13 at 01:49
  • I'll see if I can get you some kind of citation for it. Oh, I am going to add another possibility to my list - check it in a while. – Glen_b Mar 25 '13 at 01:54
  • Hi there, any updates on finding a citation? Thanks! – CodeGuy Mar 25 '13 at 18:36