[R] pls regression - optimal number of LVs
Dowkiw, Arnaud
Arnaud.Dowkiw at dpi.qld.gov.au
Thu Jul 24 05:02:02 CEST 2003
Dear R-helpers,
I have performed a PLS regression with the mvr function from the pls.pcr package, and I have two questions:
1- Do you know whether mvr automatically centers the data? It seems to me that it does.
2- Why, in the situation below, does the output say that the optimal number of latent variables (LVs) is 4? In my humble opinion it is 2, because the cross-validated RMS increases (12.15 to 13.81) and the R2 decreases (0.51 to 0.41) when a third LV is added:
> summary(maturityCondor.raw.mvr)
Data: X dimension: 8 1050
Y dimension: 8 1
Method: SIMPLS
Number of latent variables considered: 1-7
TRAINING:
RMS table:
[,1]
1 LV's 1.23e+01
2 LV's 6.79e+00
3 LV's 5.00e+00
4 LV's 2.17e+00
5 LV's 1.93e+00
6 LV's 7.79e-01
7 LV's 1.01e-09
Cumulative fraction of variance explained:
X Y
1 LV's 0.848 0.499
2 LV's 0.930 0.846
3 LV's 0.979 0.917
4 LV's 0.992 0.984
5 LV's 0.999 0.988
6 LV's 1.000 0.998
7 LV's 1.000 1.000
VALIDATION
Optimal number of latent variables: 4
RMS table (10-fold crossvalidation):
[,1]
1 LV's 16.21
2 LV's 12.15
3 LV's 13.81
4 LV's 6.68
5 LV's 6.38
6 LV's 5.91
7 LV's 13.38
Coefficient of multiple determination (R2):
[,1]
1 LV's 0.20
2 LV's 0.51
3 LV's 0.41
4 LV's 0.88
5 LV's 0.87
6 LV's 0.90
7 LV's 0.77
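To make question 2 concrete, here is a minimal base-R sketch that applies three candidate selection rules to the cross-validated RMS values printed above. It does not call mvr at all. The third rule (smallest model within a tolerance of the minimum) and its tolerance `tol` are assumptions chosen for illustration, not something I know mvr to implement.

```r
# Cross-validated RMS values copied from the summary output above (1 to 7 LVs).
cv_rms <- c(16.21, 12.15, 13.81, 6.68, 6.38, 5.91, 13.38)

# Rule A: number of LVs at the global minimum of the CV RMS.
global_min <- which.min(cv_rms)

# Rule B: first local minimum, i.e. the last LV count before the RMS first goes up.
first_local_min <- which(diff(cv_rms) > 0)[1]

# Rule C (hypothetical): smallest number of LVs whose RMS lies within
# `tol` of the global minimum. `tol` is an illustrative value, not taken
# from the mvr output.
smallest_within <- function(rms, tol) min(which(rms <= min(rms) + tol))

print(global_min)                      # 6
print(first_local_min)                 # 2
print(smallest_within(cv_rms, tol = 1)) # 4
```

On these numbers, the first-local-minimum rule gives the 2 LVs I expected, the global minimum gives 6, and a tolerance-type rule gives 4 for a fairly wide range of tolerances (roughly between 0.8 and 6.2). So a rule of that kind could be what produces the reported optimum, though that is only a guess.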
Thanks for your help,
Arnaud
*************************
Arnaud DOWKIW
Department of Primary Industries
J. Bjelke-Petersen Research Station
KINGAROY, QLD 4610
Australia
T : + 61 7 41 600 700
T : + 61 7 41 600 728 (direct)
F : + 61 7 41 600 760
**************************