Suppose I have heteroscedastic data in which the size of the error terms increases with the magnitude of the observations.
Assuming that both of these appear to fit the data well, which is the correct model to use, and why?
$Y = \beta X + \epsilon X$
or
$Y = \beta X + \epsilon Y$
where in both cases $\epsilon$ is drawn from $\mathcal{N}(\mu = 1, \sigma)$.
For prediction, the second model is clearly less useful, since the value of $Y$ is unknown at prediction time. But if we are doing inference, is there any advantage to it? For example, could the fitted value of $\sigma$ be used as a measure of model performance?
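For concreteness, here is a small simulation under the first model, with hypothetical values $\beta = 2$ and $\sigma = 0.5$ (not from the question). Since $Y = \beta X + \epsilon X = (\beta + \epsilon)X$ and $E[\epsilon] = 1$, a no-intercept least-squares fit estimates $\beta + 1$ rather than $\beta$, and $\sigma$ can be recovered from the spread of $Y/X$ around the fitted slope:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters chosen for illustration only
beta, sigma, n = 2.0, 0.5, 10_000

# Simulate from the first model: Y = beta*X + eps*X, eps ~ N(mu=1, sigma).
# Equivalently Y = (beta + eps) * X, so the error scales with X.
X = rng.uniform(1.0, 10.0, n)
eps = rng.normal(loc=1.0, scale=sigma, size=n)
Y = beta * X + eps * X

# No-intercept OLS slope; because E[eps] = 1 it estimates beta + 1, not beta.
slope = np.sum(X * Y) / np.sum(X ** 2)

# Under this model, Y/X - slope recovers the centred noise,
# so its standard deviation is a direct estimate of sigma.
sigma_hat = np.std(Y / X - slope)

print(slope)      # close to beta + 1 = 3.0
print(sigma_hat)  # close to sigma = 0.5
```

This illustrates one sense in which the fitted $\sigma$ is interpretable under the first model: it is the per-unit-of-$X$ noise scale, estimable directly from the data without knowing $Y$ in advance.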