To begin with, you have to define the concept of equivalence. One might say that two models are equivalent when they produce almost the same forecasting accuracy (relevant for time series and panel data); another might be interested in whether the fits produced by the models are close. The former is the object of cross-validation (usually the jackknife or some out-of-sample tests; Rob's accuracy() does this nicely), while the latter calls for the minimization of some information criterion.
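For the first concept, here is a minimal sketch of an out-of-sample comparison using the forecast package; the data set, split point, and candidate models are illustrative assumptions, not part of the original answer:

library(forecast)
# hold out the last two years of a monthly series as a test set (illustrative)
train <- window(AirPassengers, end = c(1958, 12))
test  <- window(AirPassengers, start = c(1959, 1))
fit_ets   <- ets(train)        # exponential smoothing model
fit_arima <- auto.arima(train) # automatically selected ARIMA model
# compare out-of-sample accuracy measures (RMSE, MAE, MAPE, ...)
accuracy(forecast(fit_ets,   h = length(test)), test)
accuracy(forecast(fit_arima, h = length(test)), test)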
In microeconometrics the usual choice is $BIC$, though you may also consider $AIC$ when working with small sample sizes. Note that model choice based on the minimization of an information criterion is also relevant for nested models.
A nice discussion is given in the must-have book by Cameron and Trivedi (Chapter 8.5 provides an excellent review of the methods); more specific theoretical details can be found in Hong and Preston here.
Roughly speaking, of two models the more parsimonious one (the one with fewer parameters to estimate, and therefore more degrees of freedom) is suggested as preferable. An information criterion introduces a special penalty function that restricts the inclusion of additional explanatory variables into the linear model, conceptually similar to the penalty introduced by the adjusted $R^2$.
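For reference, the standard forms of the two criteria (with $\hat{L}$ the maximized likelihood, $k$ the number of estimated parameters, and $n$ the sample size) make the penalty explicit:

$$AIC = -2\ln\hat{L} + 2k, \qquad BIC = -2\ln\hat{L} + k\ln n.$$

Since $\ln n > 2$ once $n > 7$, $BIC$ penalizes extra parameters more heavily than $AIC$ in all but the smallest samples, which is why the small-sample caveat above matters.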
However, you may not just be interested in choosing the model that minimizes the selected information criterion. The equivalence concept implies that some test statistic should be formulated. Therefore you may go for likelihood-ratio tests: either the Cox or Vuong $LR$ tests, or the Davidson-MacKinnon $J$ test.
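To sketch the idea behind the Vuong test (a standard formulation, not spelled out in the original answer): for two non-nested models with fitted densities $f$ and $g$, the log-likelihood ratio $LR_n = \sum_{i=1}^{n} \ln \frac{f(y_i \mid \hat{\theta})}{g(y_i \mid \hat{\gamma})}$ is normalized by its estimated standard deviation,

$$V = \frac{LR_n}{\sqrt{n}\,\hat{\omega}_n} \overset{d}{\longrightarrow} N(0,1) \quad \text{under } H_0,$$

where $\hat{\omega}_n^2$ is the sample variance of the individual log-likelihood ratio terms. Under $H_0$ the two models are equally close to the true data-generating process, and a large $|V|$ favours one model over the other.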
Finally, according to the tags, you may just be interested in the R functions:
library(lmtest)
coxtest(fit1, fit2)
jtest(fit1, fit2)
where fit1 and fit2 are two non-nested fitted linear regression models, coxtest is the Cox $LR$ test, and jtest the Davidson-MacKinnon $J$ test.
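A minimal self-contained sketch of these calls; the data set and model formulas below are illustrative assumptions only:

library(lmtest)
# two non-nested models for the same response: neither regressor set
# is a subset of the other
fit1 <- lm(mpg ~ wt + disp, data = mtcars)
fit2 <- lm(mpg ~ hp + qsec, data = mtcars)
coxtest(fit1, fit2)  # Cox LR test (reports both orderings)
jtest(fit1, fit2)    # Davidson-MacKinnon J test (reports both orderings)
AIC(fit1, fit2)      # information criteria as a parsimony guide
BIC(fit1, fit2)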
jtest or coxtest with non-nested fits from Step 1. An information criterion for non-nested models will be a nice guide to which model is more statistically suitable (parsimonious), but for hypothesis testing I would just go for any $LR$ test myself (the log-likelihood is actually a part of any information criterion). Conclusions will be somewhat close, but since there are two deterministically given penalty functions, it is a bit tricky to compare them statistically. – Dmitrij Celov Mar 20 '11 at 15:38