# [R] Multivariate multiple regression

Henric Nilsson henric.nilsson at statisticon.se
Thu May 5 09:01:52 CEST 2005

```Iain Pardoe said the following on 2005-05-04 20:08:

> I'd like to model the relationship between m responses Y1, ..., Ym and a
> single set of predictor variables X1, ..., Xr.  Each response is assumed
> to follow its own regression model, and the error terms in each model
> can be correlated.  My understanding is that although lm() handles
> vector Y's on the left-hand side of the model formula, it really just
> fits m separate lm models.  What should I use to do a full multivariate
> analysis (as in section 7.7 of Johnson/Wichern)?  Thanks.

Have you tried using the `lm' function? Note that R 2.1.0 added several
useful functions for multivariate responses.

Let's replicate Johnson & Wichern's Example 7.8 (p. 413, in the 4th Ed.)
using `lm':

> ex7.8 <- data.frame(z1 = c(0, 1, 2, 3, 4),
+                     y1 = c(1, 4, 3, 8, 9),
+                     y2 = c(-1, -1, 2, 3, 2))
>
> f.mlm <- lm(cbind(y1, y2) ~ z1, data = ex7.8)
> summary(f.mlm)
Response y1 :

Call:
lm(formula = y1 ~ z1, data = ex7.8)

Residuals:
1          2          3          4          5
6.880e-17  1.000e+00 -2.000e+00  1.000e+00 -1.326e-16

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.0000     1.0954   0.913   0.4286
z1            2.0000     0.4472   4.472   0.0208 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.414 on 3 degrees of freedom
Multiple R-Squared: 0.8696,     Adjusted R-squared: 0.8261
F-statistic:    20 on 1 and 3 DF,  p-value: 0.02084

Response y2 :

Call:
lm(formula = y2 ~ z1, data = ex7.8)

Residuals:
1          2          3          4          5
9.931e-17 -1.000e+00  1.000e+00  1.000e+00 -1.000e+00

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)  -1.0000     0.8944  -1.118   0.3450
z1            1.0000     0.3651   2.739   0.0714 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.155 on 3 degrees of freedom
Multiple R-Squared: 0.7143,     Adjusted R-squared: 0.619
F-statistic:   7.5 on 1 and 3 DF,  p-value: 0.07142

> SSD(f.mlm)
\$SSD
y1 y2
y1  6 -2
y2 -2  4

\$call
lm(formula = cbind(y1, y2) ~ z1, data = ex7.8)

\$df
[1] 3

attr(,"class")
[1] "SSD"
> f.mlm1 <- update(f.mlm, ~ 1)
> anova(f.mlm, f.mlm1)
Analysis of Variance Table

Model 1: cbind(y1, y2) ~ z1
Model 2: cbind(y1, y2) ~ 1
Res.Df Df Gen.var.  Pillai approx F num Df den Df Pr(>F)
1      3      1.4907
2      4  1   4.4721  0.9375  15.0000      2      2 0.0625 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

HTH,
Henric

```

More information about the R-help mailing list