
Very simple. I am looking for a package that does Multivariate Linear Regression with weights on the observations. Does anyone know of a package that does this? I am shocked that I have not been able to find any.

NOTE: R does NOT do multivariate regression. The lm() help page specifically states: "If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix." This means independent regression models for each response variable. Thus lm() does NOT do multivariate linear regression. It merely does several univariate linear regressions for convenience.

cmo
  • Although it is correct that lm() does not handle weighted multivariate regression, it does do unweighted multivariate regression properly. Fitting a least-squares estimate separately to each column of the response matrix provides the correct coefficient estimates. The "mlm" objects returned by lm() for models with response matrices contain the information needed for true multivariate inference. See Fox and Weisberg, and my further comments on an answer below. – EdM Aug 11 '20 at 12:19

3 Answers


Try package MRCE in R. This is for "Multivariate regression with covariance estimation".

Stat
  • Good suggestion! Doesn't seem to include observation weights, though. The weights ("lambdas" in the documentation) are for the independent variables... – cmo Mar 15 '13 at 20:05
  • The package suggested in this answer is now archived. – EdM Aug 11 '20 at 18:31

Case weights in a multivariate (multiple-outcome) regression don't have the straightforward meaning that they have in weighted least squares with a single outcome variable. There, each weight ideally represents the inverse of the variance of the corresponding outcome value, with error variances independent among cases. In a multivariate regression, such an interpretation of a case weight would implicitly assume that all outcomes had the same relative variances from case to case. Also, a major reason for multivariate regression is to estimate the covariances among the outcome values.

A work-around is to take advantage of how, with a single outcome, a data transformation followed by OLS gives the same regression coefficients as weighted least squares: if you pre-multiply both the design matrix and the outcome vector by the diagonal matrix of the square roots of the case weights, OLS reproduces the weighted least-squares estimates. Because the regression coefficients returned by a multivariate regression are the same as those from separate regressions on each outcome variable, you can extend this to pre-multiplying the outcome matrix, if you are willing to accept the consequences of any inapplicability of case weights to a multivariate regression. Transform the data first, then do the multivariate regression.
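A minimal sketch of that transformation in R, with simulated data and case weights used purely for illustration:

```r
## Simulated data and weights, purely for illustration.
set.seed(1)
n <- 100
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))   # predictors
Y <- cbind(y1 = rnorm(n), y2 = rnorm(n))   # two outcome variables
w <- runif(n, 0.5, 2)                      # case weights, assumed known

sw <- sqrt(w)
Xw <- sw * cbind(intercept = 1, X)         # scale the rows of the design matrix
Yw <- sw * Y                               # scale the rows of the outcome matrix

## OLS on the transformed data; "- 1" because the (scaled) intercept
## column is already included in Xw.
fit_w <- lm(Yw ~ Xw - 1)
coef(fit_w)

## The same coefficients come from weighted lm() on each outcome separately:
coef(lm(Y[, 1] ~ X, weights = w))
coef(lm(Y[, 2] ~ X, weights = w))
```

The coefficient estimates from the transformed multivariate fit match the single-outcome weighted fits; the caveats above about interpreting case weights with multiple outcomes still apply.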

Despite the fear raised by the OP, lm() handles unweighted multivariate regressions quite well. It produces "mlm" objects that contain all the information needed for standard multivariate inference. See Fox and Weisberg. The R stats package simply (and, I expect, for the reasons noted above) refuses to process a weighted multivariate regression beyond the estimation of the coefficients.
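As a small sketch with simulated data (names chosen only for illustration), the standard "mlm" methods in the stats package can be applied to such a fit:

```r
## Unweighted multivariate regression; simulated data for illustration.
set.seed(1)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
Y <- with(dat, cbind(y1 = 1 + 2 * x1 + rnorm(100),
                     y2 = -1 + x2 + rnorm(100)))

fit <- lm(Y ~ x1 + x2, data = dat)
class(fit)     # "mlm" "lm"

SSD(fit)       # residual sums of squares and cross-products
estVar(fit)    # estimated residual covariance matrix of the outcomes
vcov(fit)      # Kronecker-product covariance of the coefficient estimates
anova(fit)     # multivariate tests (Pillai's trace by default)
```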

jay.sf
EdM

This is an old post, but the OP is factually wrong in claiming that R doesn't do multivariate regression.

The documentation states "If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix." The key thing here is that the RESPONSE is a matrix. That is, if Y is a matrix, then R fits ncol(Y) separate models against the same X: Y[, i] ~ X.
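A quick check with simulated data (purely illustrative) shows the matrix-response fit reproducing the column-by-column fits:

```r
## Simulated data for illustration only.
set.seed(1)
X <- matrix(rnorm(100), 50, 2, dimnames = list(NULL, c("x1", "x2")))
Y <- matrix(rnorm(150), 50, 3, dimnames = list(NULL, c("y1", "y2", "y3")))

coef(lm(Y ~ X))        # one call: one column of coefficients per outcome
coef(lm(Y[, 1] ~ X))   # identical to the y1 column above
```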

Chris
  • I'm glad you're not downvoting, because the misunderstanding might be on your part: while multiple regression deals with multiple independent variables, multivariate regression concerns a vector-valued dependent (response) variable. It is not the same as a set of independently fitted multiple regressions to the components of the response. – whuber Nov 11 '13 at 21:16
  • This answer is correct, so far as it goes. That lm() in R estimates regression coefficients response-by-response does not matter, as the coefficient estimates are the same. For _unweighted_ multivariate regressions, which produce "mlm" objects, the standard methods for true multivariate inference are available: residual sums of squares and products are provided by SSD(), estVar() provides the estimated covariance matrix of the residuals, and vcov() provides the appropriate Kronecker product for the coefficient-estimate covariances. It does not, however, handle weighted multivariate regressions. – EdM Aug 11 '20 at 12:12
  • @EdM In my research on inference for MLMs, I came across your comment. I do indeed want to do inference with a weighted MLM and noticed this problem that R cannot do that. I even asked a question on that. Is the effort involved too great, or is it simply not possible? Do you perhaps know of a way to determine the vcov of an MLM in R? I need it to calculate HC0 standard errors of the MLM. – jay.sf Jul 02 '23 at 10:10
  • @jay.sf my answer explains why I think that weighted multivariate analyses aren't supported: as case weights are typically based on an estimate of residual variance, having multiple outcomes would tend to mean different case weights for each outcome. If you have another reliable weighting method, my answer also includes a suggestion for a data transformation that could accomplish what you want. – EdM Jul 02 '23 at 13:52
  • @jay.sf I've now illustrated the suggested data transformation as an answer to your question. – EdM Jul 02 '23 at 14:45