Hello, I would like to perform a multivariate regression analysis that models the relationship between m responses Y1, ..., Ym and a single common set of predictor variables X1, ..., Xr. Each response is assumed to follow its own regression equation, and the error terms may be correlated across responses.
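To be explicit about the model I have in mind (the notation below is mine, following the usual multivariate linear model setup): with n observations, let Y be the n x m matrix of responses, X the n x (r+1) design matrix (first column of ones for the intercept), and B the (r+1) x m matrix of coefficients, so that

    Y = X B + E,

where each row of the n x m error matrix E has mean zero and a common m x m covariance matrix Sigma. Sigma is what allows the errors of the m responses to be correlated within an observation, while different observations are assumed independent.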
Based on my reading of the R-help archives and the R documentation, the function lm() should be able to perform this analysis. However, my tests of lm() suggest that it simply fits m separate regression models, without accounting for any possible correlation between the response variables. The example below demonstrates that the multivariate result produced by lm() is identical to running a separate regression on each of the responses Y1, ..., Ym, which is not the desired outcome. Can anyone confirm whether lm() is indeed capable of performing a multivariate regression analysis, and if so, how?

Thanks very much,
Ernest

PS: My post is based on an earlier (2005) post in which the same question was asked. The conclusion at that time was that lm() can handle the multivariate analysis, but my tests (shown below) indicate that lm() actually performs a separate regression for each response variable without accounting for their possible correlation.

### multivariate analysis using lm() ###
########################################
> ex7.8 <- data.frame(z1 = c(0, 1, 2, 3, 4), y1 = c(1, 4, 3, 8, 9), y2 = c(-1, -1, 2, 3, 2))
>
> f.mlm <- lm(cbind(y1, y2) ~ z1, data = ex7.8)
> summary(f.mlm)
Response y1 :

Call:
lm(formula = y1 ~ z1, data = ex7.8)

Residuals:
 1  2  3  4  5
 0  1 -2  1  0

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     1.0954   0.913   0.4286
z1            2.0000     0.4472   4.472   0.0208 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared: 0.8696,     Adjusted R-squared: 0.8261
F-statistic: 20 on 1 and 3 DF,  p-value: 0.02084

Response y2 :

Call:
lm(formula = y2 ~ z1, data = ex7.8)

Residuals:
         1          2          3          4          5
-1.110e-16 -1.000e+00  1.000e+00  1.000e+00 -1.000e+00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.0000     0.8944  -1.118   0.3450
z1            1.0000     0.3651   2.739   0.0714 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.155 on 3 degrees of freedom
Multiple R-squared: 0.7143,     Adjusted R-squared: 0.619
F-statistic: 7.5 on 1 and 3 DF,  p-value: 0.07142

### separate linear regressions on y1 and y2 ###
################################################
> f.mlm1 <- lm(y1 ~ z1, data = ex7.8); summary(f.mlm1)

Call:
lm(formula = y1 ~ z1, data = ex7.8)

Residuals:
 1  2  3  4  5
 0  1 -2  1  0

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     1.0954   0.913   0.4286
z1            2.0000     0.4472   4.472   0.0208 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-squared: 0.8696,     Adjusted R-squared: 0.8261
F-statistic: 20 on 1 and 3 DF,  p-value: 0.02084

> f.mlm2 <- lm(y2 ~ z1, data = ex7.8); summary(f.mlm2)

Call:
lm(formula = y2 ~ z1, data = ex7.8)

Residuals:
         1          2          3          4          5
-1.110e-16 -1.000e+00  1.000e+00  1.000e+00 -1.000e+00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.0000     0.8944  -1.118   0.3450
z1            1.0000     0.3651   2.739   0.0714 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.155 on 3 degrees of freedom
Multiple R-squared: 0.7143,     Adjusted R-squared: 0.619
F-statistic: 7.5 on 1 and 3 DF,  p-value: 0.07142
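For completeness, below is a minimal sketch of where I would currently expect the cross-response information to show up, assuming the ex7.8 data frame and the f.mlm fit from above. I am not sure these are the intended tools (which is partly why I am asking), so please treat this as a guess rather than a worked answer.

### where I would expect the correlation to enter ###
#####################################################
## estimated error covariance matrix across the two responses:
## residual cross-products divided by the residual degrees of freedom
SSD(f.mlm)                                        # residual sums of squares and cross-products
estVar(f.mlm)                                     # estimated 2 x 2 error covariance matrix
crossprod(residuals(f.mlm)) / df.residual(f.mlm)  # the same matrix computed by hand

## multivariate tests on z1, which (as I understand it) do use that covariance
anova(f.mlm, test = "Wilks")                      # Wilks' lambda
anova(f.mlm, test = "Pillai")                     # Pillai's trace
f.man <- manova(cbind(y1, y2) ~ z1, data = ex7.8)
summary(f.man, test = "Wilks")                    # the same kind of test via manova()

My (possibly mistaken) understanding is that with a common design matrix the multivariate least-squares coefficient estimates coincide with the equation-by-equation estimates, so any effect of the correlation would appear only in the estimated error covariance and in multivariate tests such as those above. Manova() in the car package and the systemfit package (for seemingly unrelated regressions, where the design matrices may differ) also look relevant, but I have not tried them.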