3

Suppose I have a linear regression model that was trained on a dataset of size $n$. Unfortunately, I no longer have access to the original dataset; the only thing stored is the model parameters $B_0, B_1$. Suppose I come across another dataset (from the same distribution) of size $m$. Is it possible to combine the models, i.e., find the OLS parameters as if I had access to all $m + n$ points and used them for training?

Bepop
  • 307

1 Answer

4

Yes. To update the model, you can use an algorithm for online (or streaming) linear regression. If I understand you correctly, you have parameter estimates for a simple linear regression (one feature and an intercept). In that case, you just need an algorithm that updates the covariance in a streaming fashion, as described in Are there algorithms for computing "running" linear or logistic regression parameters?. Algorithms are also available for models with more features. Finally, you can use the known parameters to set priors for a Bayesian linear regression to achieve the same goal.
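As a rough illustration of the streaming covariance update for the one-feature case, here is a minimal sketch using Welford-style running statistics. The class name `RunningSimpleRegression` and its fields are hypothetical, not taken from any library. Note the assumption it relies on: the running count, means, and centered sums must have been kept alongside the model, since the coefficients $B_0, B_1$ on their own do not contain them.

```python
from dataclasses import dataclass


@dataclass
class RunningSimpleRegression:
    """Hypothetical helper: simple linear regression from running statistics."""
    n: int = 0            # number of points seen so far
    mean_x: float = 0.0   # running mean of x
    mean_y: float = 0.0   # running mean of y
    sxx: float = 0.0      # running sum of (x - mean_x)^2
    sxy: float = 0.0      # running sum of (x - mean_x) * (y - mean_y)

    def update(self, x: float, y: float) -> None:
        """Fold one new observation into the running statistics."""
        self.n += 1
        dx = x - self.mean_x          # deviation from the OLD mean of x
        dy = y - self.mean_y          # deviation from the OLD mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Welford-style updates: old-mean deviation times new-mean deviation
        self.sxx += dx * (x - self.mean_x)
        self.sxy += dx * (y - self.mean_y)

    def coefficients(self) -> tuple[float, float]:
        """Return (intercept, slope) of the OLS fit on all points seen."""
        if self.n < 2 or self.sxx == 0.0:
            raise ValueError("need at least two points with distinct x values")
        slope = self.sxy / self.sxx
        intercept = self.mean_y - slope * self.mean_x
        return intercept, slope


# Usage: the first n points, then the m new points, fed one at a time.
model = RunningSimpleRegression()
for x, y in [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]:   # "old" data
    model.update(x, y)
for x, y in [(4.0, 7.8), (5.0, 10.1)]:              # "new" data
    model.update(x, y)
intercept, slope = model.coefficients()
```

Because the five stored numbers are sufficient statistics for the fit, the result should match an OLS fit on all $m + n$ points up to floating-point error.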

However, keep in mind that this works only if both regressions use the same features. If the new dataset contains more or fewer features, the estimates you already have may not be useful, because the parameters may change when the set of features changes.

Tim
  • 138,066