I have a data set containing some extreme, but nonetheless important, observations, which cause violations of the linear regression assumptions of normality and constant variance. As I read here: Regression when the OLS residuals are not normally distributed, the normality assumption is of less concern as long as we're not concerned with p-values or intervals.
The violation of constant variance, however, is discussed here: What does having "constant variance" in a linear regression model mean?, as well as here: Regression when the OLS residuals are not normally distributed, where Michael R. Chernick writes in the comments on his answer:
log transformation or Box-Cox with small lambda shrink the tails. That can work for some heavytailed and skewed distribution. I don't know what if any transformations will work for very heavy-tailed distributions.
I therefore tried a Box-Cox transformation, with optimal lambda = 0.1414141, and also tried replacing the response variable with its first difference ($y_t - y_{t-1}$). I plotted the results of the optimal Box-Cox transformation:
This does not provide a solution to my situation, and the assumption of constant variance remains violated.
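For readers who want to reproduce the two transformations mentioned above, here is a minimal sketch. It assumes Python with NumPy and SciPy (the original analysis may have used different software) and uses synthetic, strictly positive data with two inflated values as a stand-in for the travel times; the variable names and the data are purely illustrative, so the estimated lambda will not match the value quoted above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic, strictly positive "travel times" (illustrative only, not the real data).
y = rng.lognormal(mean=3.0, sigma=0.3, size=200)
# Inflate two observations to mimic the extreme but legitimate points.
y[[50, 120]] *= 8.0

# Box-Cox transformation; with no lambda supplied, SciPy estimates it
# by maximum likelihood. The input must be strictly positive.
y_bc, lam = stats.boxcox(y)

# First difference of the response, y_t - y_{t-1} (one observation shorter).
y_diff = np.diff(y)

print(f"estimated lambda: {lam:.4f}")
print(f"n = {len(y)}, n after differencing = {len(y_diff)}")
```

Plotting residuals against fitted values after refitting the regression on `y_bc` or `y_diff` is one way to judge whether the spread has become more even.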
I would like some pointers on what, if anything, I can do to fix this, or alternatively how I can model my data using other regression techniques.
Edit: My modelling is about quantifying the uncertainty of an agnostic prediction model in a highly stochastic environment (travel times in traffic). For the two data points in the graph that would normally be considered outliers, the agnostic prediction model made a reasonable prediction. I have checked this by comparing the predictions with the actual outcomes. That motivates me not to treat them as outliers, as I assume the data fed to the prediction model was valid.
The agnostic prediction model has access to GPS position and speed when making its predictions about arrival times. I don't have access to the same data; instead, I am using the previous travel time as a modelling feature. The reason these data points are so extreme is unknown to me.
