
Assume we have a cross-section of $N$ stocks. $Y_i$ is a sample variance estimate of stock returns for stock $i$, computed from $T_i$ observations. The $T_i$ are not necessarily equal, i.e., the sample size used to estimate $Y_i$ differs across $i = 1, 2, \dots, N$.

Now I want to run a cross-sectional weighted least squares regression:

$Y_i = \beta X_i + \epsilon_i$

What is the best choice of weights here, given that the weights should be based on $T_i$ for each $Y_i$? In other words, I want to assign a smaller weight to stock $i$ when $T_i$ is small.
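
For concreteness, a minimal sketch of the setup in Python (all names and numbers are illustrative, not part of the question's data):

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100                                        # number of stocks
T = rng.integers(20, 250, size=N)              # unequal sample sizes T_i
true_var = 0.04 * np.exp(rng.normal(size=N))   # true return variances

# Y_i: sample variance of stock i's returns, estimated from T_i observations
Y = np.array([np.var(rng.normal(0.0, np.sqrt(v), t), ddof=1)
              for v, t in zip(true_var, T)])

X = rng.normal(size=N)                         # cross-sectional regressor
```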

Mayou
  • If I understand it well, then you have $N$ stocks, and within each of these stocks you have observations $(x_{ij}, y_{ij})$, $i=1 \dots N, j=1 \dots n_i$? If so, I would suggest you use the generalised least squares estimator. An R implementation can be found in the package nlme, function 'gls', where your grouping variable is the stock. Do you know R? –  Nov 19 '15 at 07:14
  • As the user before me wrote, look up weighted least squares and generalized least squares. They are the canonical solutions to this problem, I believe. – Eike P. Aug 03 '21 at 21:33

3 Answers


I don't think there's a single optimal weighting scheme here. I'd first try $w_i=\frac{NT_i}{\sum_iT_i}$. This way $\sum_iw_i=N$, and if all the $T_i$ are equal then every $w_i=1$; both are nice properties.
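
A minimal sketch of this scheme, assuming the arrays `T`, `X`, `Y` from the sketch in the question and the statsmodels package (both assumptions, not part of the answer):

```python
import statsmodels.api as sm

# w_i = N * T_i / sum(T_i): the weights sum to N, and every w_i = 1
# when all the T_i are equal
w = len(T) * T / T.sum()

# statsmodels treats WLS weights as inverse error variances (up to scale);
# no intercept, matching the model Y_i = beta * X_i + eps_i
fit = sm.WLS(Y, X, weights=w).fit()
print(fit.params)
```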

Aksakal
  • Thanks for your answer. Two follow-up questions, please: 1) Do weights need to be normalized in weighted least squares? 2) What do you think about $w_i = \sqrt{T_i}$? – Mayou Dec 05 '14 at 15:12
  • No, the weights don't have to be normalized, but it's nice to have them equal to 1 when the numbers of observations are equal, because then the SSE will be the same as in OLS. It's just easier to compare and track the results. The square root of $T_i$ is good too, because it links to the random-walk scaling of volatility with time. – Aksakal Dec 05 '14 at 15:16
  • Great, thank you. So in your opinion, what is the advantage of using $\sqrt{T_i}$ instead of $T_i$ as weights? – Mayou Dec 05 '14 at 15:28
  • Stocks with larger samples will have less impact than under the linear weights. – Aksakal Nov 11 '16 at 14:34
  • Unfortunately, these are the wrong weights to use, because the formula assumes all variation in the dependent values is due to measurement error. In effect, it assumes the regression model is perfect and has no error at all. If you had a prior sense of the variance $\sigma^2$ of the regression error, you could add it to each $Y_i$'s sampling variance and use the inverses of those sums in a weighted regression (see the sketch below). There are other issues to deal with, too, not least of which is the likely highly positively skewed distribution of the individual variance estimates. – whuber Feb 13 '22 at 17:38
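
A hedged sketch of the weighting whuber describes, continuing the arrays from the question's sketch (assumptions: Gaussian returns, so the sampling variance of $Y_i$ is roughly $2Y_i^2/(T_i-1)$, and a hypothetical prior guess `sigma2_reg` for the regression error variance):

```python
import numpy as np
import statsmodels.api as sm

sigma2_reg = 1e-4                  # hypothetical prior regression-error variance
v = 2.0 * Y**2 / (T - 1)           # approximate sampling variance of each Y_i
w = 1.0 / (sigma2_reg + v)         # inverse of the total residual variance
fit = sm.WLS(Y, X, weights=w).fit()
```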

$Y_i$ (the sample variance estimate of stock returns for stock $i$) is going to be too volatile. Replace it with a robust estimator such as the median absolute deviation (MAD) in the weight function. I employed the latter successfully in a solvency model for insurance companies.

Also, if you regressed the sample variance estimate of stock returns against log(capitalization), a measure of a company's size, you should get an inverse smoothing effect, as large companies have, on average, lower volatility in earnings and a lower stock beta. I would combine this with the MAD estimate.
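
A minimal sketch of swapping a MAD-based scale into the weight function (numpy only; the fat-tailed `returns` data is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical per-stock return series of unequal lengths T_i
returns = [0.02 * rng.standard_t(df=4, size=t)
           for t in rng.integers(20, 250, size=100)]

def mad(x):
    # median absolute deviation, scaled by 1.4826 so that it estimates
    # the standard deviation under normality
    return 1.4826 * np.median(np.abs(x - np.median(x)))

# robust variance proxies to use in place of the sample variances Y_i
Y_robust = np.array([mad(r) ** 2 for r in returns])
```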

AJKOER

If my description of what you are doing is wrong, please correct me: we have a set of values $\{x_i\}$ and the corresponding $\{y_i\}$. From simple least squares, $y = Ax + B$. Then we compute the total variance $V = \sum_i (y_i - Ax_i - B)^2$. It's a kind of iteration: we then repeat the minimization of the variance functional using the weights:

$w_i = \dfrac{V}{(y_i - Ax_i - B)^2}$

But then some of the $w_i$ may be infinite. I don't like this.
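
A minimal sketch of the iteration described above (illustrative only; as the comments below point out, the scheme is problematic, and the zero-residual guard is my addition):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 1.5 * x + 0.3 + rng.normal(scale=0.5, size=50)

A, B = np.polyfit(x, y, 1)                    # plain least-squares start
for _ in range(10):
    r2 = (y - A * x - B) ** 2                 # squared residuals
    V = r2.sum()                              # total variance of the fit
    w = V / np.maximum(r2, 1e-12)             # guard against infinite w_i
    # np.polyfit applies weights to residuals, so pass the square root
    # of the desired squared-residual weights
    A, B = np.polyfit(x, y, 1, w=np.sqrt(w))
```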

  • Using math typesetting would make this easier to read. More information: https://math.meta.stackexchange.com/questions/5020/mathjax-basic-tutorial-and-quick-reference – Sycorax Feb 13 '22 at 17:25
  • I don't like this either, because I have tried it and it doesn't work, even when the weights converge. The problem is that the residuals reflect both the measurement error of the response variables and the error term in the regression, and this approach doesn't correctly separate the two. – whuber Feb 13 '22 at 17:31
  • Well, if we don't like the results, it seems the LS approximation falls foul somewhere further down the road, in some other calculation. In my case it's the Shannon predictivity parameter algorithm into which I feed the LS data; too small a Shannon parameter means the LS approximation was not efficient. – user143678 Feb 14 '22 at 19:37