You might fit a robust line (or one of some form that's otherwise consistent under heteroskedasticity) and then model the squared residuals against $x$ in order to model the variance. Those squared residuals could then be used to update the weights.
(Another way to get starting values would be to consider a transformation to approximately constant variance.)
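For concreteness, here's a minimal sketch of that in R; the robust fitter (MASS::rlm), the linear model for the squared residuals, and the simulated data are all just illustrative choices:

```r
## Sketch: robust first-stage fit, squared residuals modelled against x,
## then weights from the inverted variance estimates.
library(MASS)
set.seed(42)
x <- runif(200, 1, 10)
y <- 2 + 3 * x + rnorm(200, 0, 2 * sqrt(x))    # illustrative data, Var(y) prop. to x

rfit  <- rlm(y ~ x)                            # robust line as the starting fit
vfit  <- lm(residuals(rfit)^2 ~ x)             # model the squared residuals vs x
w     <- 1 / pmax(fitted(vfit), 1e-8)          # weights = 1 / estimated variance
fit_w <- lm(y ~ x, weights = w)                # weighted re-fit
coef(fit_w)
```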
Another approach, possibly simpler, would be to use GLMs (say, a Gaussian-type model with a $\mu^2$ variance function), as sketched below.
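In R, one way to realize that is via the quasi family (a minimal sketch, assuming an identity link; the simulated data are illustrative, and note the OLS starting values, in the spirit of the parenthetical above):

```r
## Sketch: quasi-likelihood fit with identity link and variance function V(mu) = mu^2
set.seed(42)
x <- runif(200, 1, 10)
y <- (2 + 3 * x) * (1 + rnorm(200, 0, 0.15))   # sd roughly proportional to the mean

fit <- glm(y ~ x, family = quasi(link = "identity", variance = "mu^2"),
           start = coef(lm(y ~ x)))            # OLS coefficients as starting values
summary(fit)
```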
Edit: yet another approach: divide through by $\sqrt{x}$. Let $y^* = y/\sqrt{x}$ and $x^* = x/\sqrt{x}$ (equivalently, $x^* = \sqrt{x}$); the constant predictor then becomes $x_0^* = 1/\sqrt{x}$. Regress $y^*$ on $x_0^*$ and $x^*$ with no intercept (because $x_0^*$ represents it). If the variance is proportional to $x$, this division makes the error variance constant. (end edit)
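In R the divide-through version is just (a sketch; the variable names and data are mine):

```r
## Sketch of the divide-through trick; assumes x > 0 and Var(y) proportional to x
set.seed(42)
x <- runif(200, 1, 10)
y <- 2 + 3 * x + rnorm(200, 0, 2 * sqrt(x))    # illustrative data, sd = 2*sqrt(x)

ystar  <- y / sqrt(x)                          # y* = y/sqrt(x)
x0star <- 1 / sqrt(x)                          # the "constant" becomes 1/sqrt(x)
xstar  <- sqrt(x)                              # x* = x/sqrt(x) = sqrt(x)

fit_div <- lm(ystar ~ 0 + x0star + xstar)      # no intercept; x0star represents it
coef(fit_div)                                  # intercept and slope of the original model

## The same coefficients come from WLS with weights 1/x:
coef(lm(y ~ x, weights = 1 / x))
```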
Finally, one could write the weighted least squares criterion as a function of the mean parameters (which also appear in the variance function) and obtain the parameter estimates as the solution of a more general optimization problem.
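A minimal sketch of that, taking $V(\mu) = \mu^2$ as in the GLM suggestion above (the criterion, data and starting values are illustrative):

```r
## Sketch: minimize sum((y - mu)^2 / V(mu)) where mu = b0 + b1*x, so the
## weights 1/V(mu) depend on the mean parameters being estimated
set.seed(42)
x <- runif(200, 1, 10)
y <- (2 + 3 * x) * (1 + rnorm(200, 0, 0.15))

wls_crit <- function(b, x, y) {
  mu <- b[1] + b[2] * x                        # mean under the candidate parameters
  if (any(mu <= 0)) return(Inf)                # keep V(mu) = mu^2 sensible
  sum((y - mu)^2 / mu^2)
}

fit_opt <- optim(coef(lm(y ~ x)), wls_crit, x = x, y = y)  # OLS start values
fit_opt$par
```

Note that letting the weights vary with the parameters during minimization is not the same estimator as iterating WLS with the weights held fixed at each step.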
Some possible references for what you were suggesting:
1) The basic iteratively (re)weighted least squares algorithm (IWLS or IRLS), used to fit all manner of things from nonlinear regression to GLMs; it applies to your situation as a special case. If you don't iterate to convergence, it's usually called a two-step estimator (estimate the model, get the residuals, estimate the variance, re-estimate the model). A sketch appears after this list.
2) You might be able to use White's heteroskedasticity-consistent estimation approach (which has a kind of connection to (3) below) as an argument for what you want to do.
3) Here's an approach relating to estimating the variance function, using a method I've seen many times but only today spotted a real reference for:
The idea is that you fit some model and take logs of the squared residuals (since the squared residuals approximate the variance), then fit a function to those. This lets you back out relative variances* and hence relative weights (by inverting the relative variances) for the original regression; see the sketch after this list.
This may well be sufficient for your case, which is a special case of that setup.
Wasserman, Larry (2006), All of Nonparametric Statistics, Berlin: Springer-Verlag (see pp. 87-88).
*(Though if I recall correctly, Geoff Eagleson established that in the linear case this is biased for the intercept term, so it doesn't give a good idea of absolute variances.)
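Re (1), a minimal IWLS loop, iterating the two-step idea to convergence (the linear model for the squared residuals and the simulated data are assumptions for illustration; stopping after one pass of the loop gives the two-step estimator):

```r
## Sketch of IWLS: alternate between a weighted fit and re-estimating the
## variance function from the squared residuals
set.seed(42)
x <- runif(200, 1, 10)
y <- 2 + 3 * x + rnorm(200, 0, 2 * sqrt(x))

w <- rep(1, length(y))                         # start unweighted
for (i in 1:20) {
  fit  <- lm(y ~ x, weights = w)               # weighted fit
  vfit <- lm(residuals(fit)^2 ~ x)             # re-estimate the variance function
  w    <- 1 / pmax(fitted(vfit), 1e-8)         # update the weights
}
coef(fit)
```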
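Re (3), a minimal sketch; a linear fit on the log scale is my simplification for illustration, and a smoother could be used for the variance function instead (as in Wasserman's discussion):

```r
## Sketch: fit a function to the log of the squared residuals, exponentiate to
## get relative variances, invert for weights
set.seed(42)
x <- runif(200, 1, 10)
y <- 2 + 3 * x + rnorm(200, 0, 2 * sqrt(x))

fit0   <- lm(y ~ x)
z      <- log(residuals(fit0)^2)               # log squared residuals
relvar <- exp(fitted(lm(z ~ x)))               # relative variances (level is biased*)
fit_w  <- lm(y ~ x, weights = 1 / relvar)      # invert relative variances for weights
coef(fit_w)
```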
--
Another possibility, though I don't have it to hand to double-check: I believe this may be covered in Chapter 4 of Sheather's book, A Modern Approach to Regression with R.