[R] Re: pls regression - optimal number of LVs

Ron Wehrens rwehrens at sci.kun.nl
Thu Jul 24 09:59:34 CEST 2003


On Thursday 24 July 2003 05:02, Dowkiw, Arnaud wrote:
> Dear R-helpers,
>
> I have performed a PLS regression with the mvr function from the pls.pcr
> package and I have 2 questions: 1- do you know if mvr automatically centers
> the data? It seems to me that it does so...

Yup, it does... common practice.
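
For readers unsure what centering means here, a minimal sketch in plain Python (illustrative only, not the pls.pcr code; the function name center_columns is made up): each column has its mean subtracted before the model is fitted.

```python
def center_columns(X):
    """Subtract each column's mean from the entries of that column.

    X is a list of rows (one row per sample, one column per variable).
    Returns the centered rows and the column means that were removed.
    """
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    centered = [[x - m for x, m in zip(row, means)] for row in X]
    return centered, means


# Hypothetical toy data: two samples, two variables.
X = [[1.0, 10.0], [3.0, 20.0]]
Xc, col_means = center_columns(X)
```

After centering, every column of Xc sums to zero, so the latent variables describe variation around the mean rather than the mean itself.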

> 2- why, in the situation below,
> does the output say that the optimal number of latent variables is 4? In
> my humble opinion, it is 2, because the RMS increases and the R2 decreases
> when 3 LVs are considered:

Many criteria exist and for some data sets they agree, for most they do not. 
The criterion applied here checks whether the decrease in cross-validated 
error is significant; Hastie et al. use it in their book "The elements of 
statistical learning". It is described in the man page, and like all 
criteria, it is not guaranteed to satisfy all users. If you feel better using 
2 LVs, you can do that. 
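
To make the idea of a significance-based criterion concrete, here is a rough sketch in plain Python. It assumes the criterion resembles the one-standard-error rule from Hastie et al.: take the smallest model whose mean cross-validated error lies within one standard error of the minimum. This is not the pls.pcr implementation; the function name one_se_rule and all the numbers are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def one_se_rule(fold_errors):
    """Pick a number of LVs by a one-standard-error rule.

    fold_errors[k] holds the per-fold CV errors for the model with k+1 LVs.
    Returns the smallest number of LVs whose mean CV error is no larger than
    (minimum mean error) + (standard error at that minimum).
    """
    means = [mean(errs) for errs in fold_errors]
    ses = [stdev(errs) / sqrt(len(errs)) for errs in fold_errors]
    best = min(range(len(means)), key=lambda k: means[k])
    threshold = means[best] + ses[best]
    for k, m in enumerate(means):
        if m <= threshold:
            return k + 1
    return best + 1  # not reached: the minimum itself always qualifies


# Invented per-fold errors for models with 1 to 4 LVs (4 folds each).
fold_errors = [
    [1.00, 1.10, 0.90, 1.05],
    [0.50, 0.60, 0.40, 0.55],
    [0.52, 0.62, 0.42, 0.57],
    [0.48, 0.58, 0.38, 0.53],
]
n_min = 1 + min(range(4), key=lambda k: mean(fold_errors[k]))  # plain minimum
n_1se = one_se_rule(fold_errors)                               # one-SE rule
```

On these invented numbers the plain minimum and the one-SE rule pick different numbers of LVs, which is the kind of disagreement between criteria meant above. The example does not reproduce the output in the original question.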

Ron

-- 
Ron Wehrens            
Dept. of Chemometrics  
University of Nijmegen	Email: rwehrens at sci.kun.nl
Toernooiveld 1		http://www-cac.sci.kun.nl/cac/
6525 ED Nijmegen	Tel: +31 24 365 2053
The Netherlands		Fax: +31 24 365 2653



