[R] Scaling in predict.prcomp
Gad Abraham
gabraham at csse.unimelb.edu.au
Mon Apr 21 02:27:32 CEST 2008
Prof Brian Ripley wrote:
> On Sun, 20 Apr 2008, Gad Abraham wrote:
>
>> Hi,
>>
>> Say x.train is a matrix of covariates that I want to do PCA on, so I can
>> do regression on its principal components, and x.test is a test set of
>> the same covariates on which I want to evaluate the regression fit. I
>> would like the covariates to be centred and scaled:
>>
>> p <- prcomp(x.train, center=TRUE, scale=TRUE)
>> x.train.pc <- predict(p)
>>
>> Now I want to get the PCs from the test set.
>
> The way to do that is to call prcomp() on the test set.
>
> If you want to project new data onto the PCs of the training set (as a
> set of axes in the data space), you just use predict(p, newdata=).
>
>> Should I use the same center and scale vectors from the training set:
>>
>> x.test.pc <- predict(p, newdata=x.test, center=p$center, scale=p$scale)
>>
>> or use the test set's own centers and scales:
>>
>> x.test.pc <- predict(p, newdata=x.test, center=TRUE, scale=TRUE)
>
> I see no evidence that those additional arguments are used.
>
> predict.prcomp uses the origin of the training set's PCs, since it is
> that coordinate system which you are projecting onto.
>
I should have looked more carefully; now I see that in the code for
predict.prcomp the test data will indeed get centred and scaled
according to the training data's vectors:
getAnywhere(predict.prcomp)
...
scale(newdata, object$center, object$scale) %*% object$rotation
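For anyone wanting to check this, here is a minimal sketch on simulated data. The matrices, column names and seed are made up for illustration and are not from my actual problem. It compares predict() against the manual projection using the training set's centre and scale, and against a projection that standardises the test set by its own statistics:

```r
set.seed(1)

# Hypothetical data: 4 covariates, with the test set deliberately shifted
# and rescaled so that the choice of centre/scale matters.
x.train <- matrix(rnorm(100 * 4), ncol = 4)
x.test  <- matrix(rnorm(20 * 4, mean = 1, sd = 2), ncol = 4)
colnames(x.train) <- colnames(x.test) <- paste0("V", 1:4)

p <- prcomp(x.train, center = TRUE, scale. = TRUE)

# Projection via predict(): uses the training set's centre and scale
pc.predict <- predict(p, newdata = x.test)

# The same projection written out by hand
pc.manual <- scale(x.test, center = p$center, scale = p$scale) %*% p$rotation

# Projection after standardising the test set by its own mean and sd
pc.own <- scale(x.test, center = TRUE, scale = TRUE) %*% p$rotation

same.as.manual <- isTRUE(all.equal(pc.predict, pc.manual,
                                   check.attributes = FALSE))
same.as.own    <- isTRUE(all.equal(pc.predict, pc.own,
                                   check.attributes = FALSE))
```

Here same.as.manual should be TRUE and same.as.own FALSE, i.e. predict() agrees with the training-set standardisation and not with the test set's own.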
Thanks,
Gad
--
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: gabraham at csse.unimelb.edu.au
web: http://www.csse.unimelb.edu.au/~gabraham