[R] "mvr" function
Bjørn-Helge Mevik
bhx2 at mevik.net
Fri Jun 3 09:55:13 CEST 2005
Jim BRINDLE writes:
> volumes <- read.table("THA_vol.txt", header = TRUE)
>
> and then created a data.frame called "vol". My response variable is
> in the last column of the "vol" data frame and my dependent variables
> are in columns 1 through 11.
[...]
> y <- vol[,12]
> X <- vol[,1:11]
> ans.pcr <- pcr(y ~ X,6,data=vol,validation="CV")
There are two problems here:
1) X is a data frame, not a matrix. This is what causes the error message.
2) The call tells pcr to look in the data frame `vol' for variables
called `y' and `X'. Presumably they don't exist there, only in the
global environment (because of the assignments `y <- vol[,12]',
etc). This will not lead to an error, because pcr falls back to the
global environment and finds the variables anyway, but it might
lead to confusion or errors if you later modify those variables.
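To see the first problem in isolation, here is a small sketch using base R only. The data are synthetic (an assumption, since THA_vol.txt is not available here): 20 rows and 12 columns named V1..V12. It checks what kind of object `vol[,1:11]' is and shows that the formula machinery refuses a data frame as a single term.

```r
# Synthetic stand-in for the poster's data (assumption: 20 rows, V1..V12).
set.seed(1)
vol <- as.data.frame(matrix(rnorm(20 * 12), nrow = 20))

y <- vol[, 12]
X <- vol[, 1:11]

is.data.frame(X)   # TRUE: subsetting columns of a data frame gives a data frame
is.matrix(X)       # FALSE

# model.frame() is the base R step that formula interfaces rely on.
# With a data frame as a term it signals an error; capture the message
# instead of stopping.
res <- tryCatch(model.frame(y ~ X), error = function(e) conditionMessage(e))
res
```

The exact wording of the message may differ from the one pcr printed, but the cause is the same.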
The first problem can be overcome by doing
X <- as.matrix(vol[,1:11])
and the second one by
ans.pcr <- pcr(y ~ X, 6, validation = "CV")
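Put together, the two fixes might look like the sketch below. Again the data are a synthetic stand-in (an assumption), and the pcr call is only run if the pls package happens to be installed.

```r
# Synthetic stand-in for the poster's data (assumption: 20 rows, V1..V12).
set.seed(1)
vol <- as.data.frame(matrix(rnorm(20 * 12), nrow = 20))

y <- vol[, 12]
X <- as.matrix(vol[, 1:11])   # a numeric matrix instead of a data frame

# The model frame can now be built: X enters as one matrix-valued term.
mf <- model.frame(y ~ X)
dim(mf$X)

# Fit only if pls is available. No `data' argument is given, so y and X
# are taken from the environment where the formula was created.
if (requireNamespace("pls", quietly = TRUE)) {
  library(pls)
  ans.pcr <- pcr(y ~ X, 6, validation = "CV")
  summary(ans.pcr)
}
```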
However, there are (as always in R :) several ways of accomplishing
the same thing. One solution is simply
ans.pcr <- pcr(V12 ~ ., 6, data = vol, validation = "CV")
(where V12 must be substituted with the name of the 12th variable of
vol; see names(vol)). This formula tells pcr to use V12 as the
response, and the remaining variables (in vol) as predictors.
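If you want to check what the `.' stands for before fitting, terms() in base R will expand it for you. A small sketch, with synthetic data whose columns happen to be named V1..V12 (an assumption; your names will differ):

```r
# Synthetic stand-in for the poster's data (assumption: 20 rows, V1..V12).
set.seed(1)
vol <- as.data.frame(matrix(rnorm(20 * 12), nrow = 20))

# Expand the dot against the columns of vol; the response is left out.
tt <- terms(V12 ~ ., data = vol)
predictors <- attr(tt, "term.labels")
predictors
length(predictors)   # 11
```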
A more general solution is to say
vol2 <- data.frame(y = vol[,12], X = I(as.matrix(vol[,1:11])))
ans.pcr <- pcr(y ~ X, 6, data = vol2, validation = "CV")
The I() makes R store X as a matrix in vol2, instead of as 11 separate
variables. This is handy for cases where you have several matrices.
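The effect of I() is easy to see by building the data frame both ways. This is only an illustration on synthetic data (an assumption), and the names with_I and without_I are made up for the comparison:

```r
# Synthetic stand-in for the poster's data (assumption: 20 rows, V1..V12).
set.seed(1)
vol <- as.data.frame(matrix(rnorm(20 * 12), nrow = 20))

with_I    <- data.frame(y = vol[, 12], X = I(as.matrix(vol[, 1:11])))
without_I <- data.frame(y = vol[, 12], X = as.matrix(vol[, 1:11]))

ncol(with_I)       # 2: y, and X kept as one matrix-valued column
ncol(without_I)    # 12: y plus 11 separate columns X.V1, ..., X.V11
dim(with_I$X)      # 20 11
is.null(without_I$X)   # TRUE: there is no single variable called X
```

With the second form, a formula like y ~ X would not find X in the data frame at all.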
The manual page for `lm' and the R manual `An Introduction to R'
(chapter 11) are good references for the formula handling in R.
--
HTH,
Bjørn-Helge Mevik