
I have some data that looks like:

| $x_i$ | $y_i$ |
|-------|-------|
| 10    | 20    |
| 11    | 21    |
| 12    | 25    |
| 1000  | 2001  |

The current method for forecasting an unseen $y'$ based on a known $x'$ is to estimate it as: $$\hat y' = \frac{1}{n} \sum_{i=1}^n \frac{y_i}{x_i} x'$$

That is, we take the arithmetic mean of the ratio of $y$ to $x$ and apply it to $x'$.
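
As a minimal sketch of this procedure on the four data points above (the function name `ratio_mean_forecast` and the test value `x_new = 50` are made up for illustration):

```python
# Data from the question.
x = [10, 11, 12, 1000]
y = [20, 21, 25, 2001]


def ratio_mean_forecast(x, y, x_new):
    """Forecast y' as (arithmetic mean of y_i / x_i) * x'."""
    ratios = [yi / xi for xi, yi in zip(x, y)]
    alpha_hat = sum(ratios) / len(ratios)
    return alpha_hat * x_new


# Forecasting at x' = 1 returns the estimated ratio itself.
alpha_hat = ratio_mean_forecast(x, y, 1.0)
forecast = ratio_mean_forecast(x, y, 50.0)
print(alpha_hat, forecast)
```

Each observation contributes one ratio with equal weight, so the point (1000, 2001) counts no more than (10, 20).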

This "model" is specified somewhat informally as a procedure and I'd like to restate it in more formal terms, so that I can determine whether other models that I am considering are generalizations of this model or different models.

So, is there a model of the form $y = \alpha x + \varepsilon$ or maybe $y = \alpha x \varepsilon$ or something, where the maximum likelihood estimator is the arithmetic mean of $y_i/x_i$?

2 Answers

1

This looks like a standard ordinary least squares regression problem. Your $y$ is the dependent variable and $x$ is the independent/predictor variable. You'd formulate the regression problem as you stated, possibly with an intercept term too.

$$ y_i=\alpha x_i+c +\epsilon_i $$

The regression would estimate $\hat\alpha,\hat c$.

If you have a new $x_j$, you can make a prediction $\hat y_j = \hat \alpha x_j + \hat c$.
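
For concreteness, here is a rough sketch of that fit using the textbook closed-form OLS formulas rather than any particular library (the helper names `ols_fit` and `predict` are hypothetical, and the data are the four points from the question):

```python
# Data from the question.
x = [10, 11, 12, 1000]
y = [20, 21, 25, 2001]


def ols_fit(x, y):
    """Closed-form simple OLS: returns (alpha_hat, c_hat) for y = alpha*x + c."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    alpha_hat = s_xy / s_xx
    c_hat = y_bar - alpha_hat * x_bar
    return alpha_hat, c_hat


def predict(alpha_hat, c_hat, x_new):
    """Prediction for a new x value."""
    return alpha_hat * x_new + c_hat


alpha_hat, c_hat = ols_fit(x, y)
residuals = [yi - predict(alpha_hat, c_hat, xi) for xi, yi in zip(x, y)]
print(alpha_hat, c_hat)
```

With an intercept in the model, the residuals sum to zero up to rounding, which is a quick sanity check on the fit.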

This can be implemented using built-in functions in R, Excel, etc.

Alex J
  • 2,151
  • I don't think that this can be expressed as OLS, at least without some transforms. Under OLS, the errors ($\varepsilon$) are additive and the large observation (1000, 2001) will dominate the estimation of the parameters. Under the ratio method, the large observation is just another observation that is near the value 2.0. I wonder if there is some weighted regression which would accommodate this. – Jamie Ballingall May 09 '23 at 16:09
  • For comparison, see this other question. It gives a formula for $\hat \beta_1$ (what we would call $\hat \alpha$) that is different from ours. – Jamie Ballingall May 09 '23 at 16:13
  • Are your $y$ always positive integers? If so have you considered a Poisson generalised linear model? There is also the ratio estimator, which often comes up in the context of survey data analysis. One of them might be helpful. – Alex J May 09 '23 at 22:57
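
The point in the comments about the large observation dominating OLS can be checked numerically. The sketch below assumes a no-intercept OLS fit ($\hat\alpha = \sum x_i y_i / \sum x_i^2$) and perturbs the last observation to a hypothetical value of 2500, purely for illustration:

```python
# Data from the question.
x = [10, 11, 12, 1000]
y = [20, 21, 25, 2001]


def ols_through_origin(x, y):
    """No-intercept OLS slope: sum(x*y) / sum(x^2)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def mean_of_ratios(x, y):
    """Arithmetic mean of y_i / x_i."""
    return sum(yi / xi for xi, yi in zip(x, y)) / len(x)


# Hypothetical perturbation: move only the large observation's y value.
y_shifted = y[:-1] + [2500]

ols_change = ols_through_origin(x, y_shifted) - ols_through_origin(x, y)
ratio_change = mean_of_ratios(x, y_shifted) - mean_of_ratios(x, y)
print(ols_change, ratio_change)
```

The OLS slope follows the large point almost one for one, while the mean of ratios moves by only a quarter of the change in that point's ratio.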
0

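So I think what I was looking for was: $$\frac{y}{x}=\alpha+\varepsilon$$ which rewrites to: $$ y=\alpha x + x \varepsilon $$ If $\varepsilon$ has constant variance, the error term $x\varepsilon$ has variance proportional to $x^2$, so I think this could be expressed as a weighted least squares regression through the origin with $1/x^2$ as the weights. (Weights of $1/x$ would instead give the ratio estimator $\sum y_i / \sum x_i$.)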
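
A quick numerical check of the weighting, as a sketch: it assumes the usual weighted least squares objective $\sum_i w_i (y_i - \alpha x_i)^2$ with no intercept, whose minimizer is $\hat\alpha = \sum w_i x_i y_i / \sum w_i x_i^2$. The helper name `wls_through_origin` is made up for illustration.

```python
# Data from the question.
x = [10, 11, 12, 1000]
y = [20, 21, 25, 2001]


def wls_through_origin(x, y, w):
    """Minimizer of sum(w * (y - alpha*x)^2) over alpha."""
    num = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    den = sum(wi * xi * xi for wi, xi in zip(w, x))
    return num / den


mean_ratio = sum(yi / xi for xi, yi in zip(x, y)) / len(x)
ratio_of_sums = sum(y) / sum(x)

# Weights 1/x^2 reproduce the arithmetic mean of y_i / x_i.
alpha_inv_x2 = wls_through_origin(x, y, [1 / xi**2 for xi in x])

# Weights 1/x give sum(y) / sum(x), the classical ratio estimator.
alpha_inv_x = wls_through_origin(x, y, [1 / xi for xi in x])

print(alpha_inv_x2, mean_ratio)
print(alpha_inv_x, ratio_of_sums)
```

Under this objective, it is the $1/x^2$ weighting that matches the original procedure. If $\varepsilon$ is additionally assumed Gaussian with constant variance, the same estimate is also the maximum likelihood estimate of $\alpha$.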