2

The model is linear: $y_i = a\cdot x_i + b + e_i,~ i = 1,2,\ldots,N$. It is given that the noise is heavy tailed, but the distribution of the noise conditional on $x$ is the same for all data points. My question is: how should I model the data-generating process? Should I use a Student-t distribution for the noise, or an M-estimator in R?

Facts:

  1. OLS is not to be used.
  2. The noise distribution can depend on $x$, but it is independent across samples.
  3. Conditional on the value of $x$, the noise has mean 0.

Further clarification: the distribution of the noise at a given $x$ is a function of $x$; however, that function is the same for every data sample $i = 1,2,\ldots,N$.
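
For concreteness, here is a rough R sketch of the two options I have in mind; the data frame `dat` with columns `x` and `y`, and the fixed degrees of freedom, are just placeholders:

```r
# Option 1: an M-estimator (Huber psi by default) from the MASS package.
library(MASS)
fit_m <- rlm(y ~ x, data = dat)

# Option 2: maximum likelihood with Student-t errors
# (degrees of freedom fixed at 3 purely for illustration).
negll <- function(par) {
  a <- par[1]; b <- par[2]; s <- exp(par[3])           # s > 0 via log scale
  -sum(dt((dat$y - a * dat$x - b) / s, df = 3, log = TRUE) - log(s))
}
fit_t <- optim(c(0, 0, 0), negll)                      # estimates of a, b, log(s)
```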

  • 1
    Is this an empirical problem (i.e. fitted model with assumptions being violated) or a theoretical (math) problem? – Jon Sep 21 '16 at 23:44
  • 1
    I am given a data set containing $x_i,y_i$ and I have to estimate $a,b$ – user131929 Sep 22 '16 at 00:50
  • See http://stats.stackexchange.com/questions/66173/regression-model-of-large-correlated-heavy-tailed-data http://stats.stackexchange.com/questions/154489/pvalues-of-glm-coefficients-and-heavy-tailed-distributed-residuals http://stats.stackexchange.com/questions/259772/how-to-analyze-random-variables-with-non-normal-distribution/259934#259934 http://stats.stackexchange.com/questions/26235/are-regressions-with-student-t-errors-useless – kjetil b halvorsen Feb 24 '17 at 08:20

1 Answer

-1

You could use iterative feasible generalized least squares.

Start by setting the weight of every data point to 1, i.e. no weighting, and iterate the following algorithm (an R sketch is given after the list):

  1. Fit a weighted least-squares regression of $y$ on $x$ using the current weights.
  2. Collect the squared residuals $e_i^2$ together with their respective $x$ values.
  3. Fit $e_i^2 = c\cdot x_i + d$. Because the noise has zero mean, $E[e_i^2]$ equals the error variance at $x_i$, so this regression estimates the variance as a function of $x$.
  4. Update the weights to the inverse of the variances predicted by the model from step 3.
  5. Go back to step 1 and repeat until the coefficients converge.
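
A minimal sketch of this loop in base R, assuming a data frame `dat` with columns `x` and `y` (the names, the iteration cap, and the tolerance are illustrative):

```r
ifgls <- function(dat, max_iter = 20, tol = 1e-6) {
  dat$w <- rep(1, nrow(dat))                          # start with unit weights
  coefs_old <- NULL
  fit <- NULL
  for (k in seq_len(max_iter)) {
    fit <- lm(y ~ x, data = dat, weights = w)         # step 1: weighted regression
    dat$e2 <- resid(fit)^2                            # step 2: squared residuals
    vfit <- lm(e2 ~ x, data = dat)                    # step 3: variance as a function of x
    vhat <- pmax(predict(vfit), .Machine$double.eps)  # guard against non-positive fits
    dat$w <- 1 / vhat                                 # step 4: inverse-variance weights
    coefs_new <- coef(fit)
    if (!is.null(coefs_old) &&
        max(abs(coefs_new - coefs_old)) < tol) break  # step 5: stop when a, b stabilise
    coefs_old <- coefs_new
  }
  fit                                                 # final weighted fit of y on x
}

# Usage (illustrative): est <- ifgls(data.frame(x = x, y = y)); coef(est)
```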