[R] LDA with previous PCA for dimensionality reduction
David Enot
dle at aber.ac.uk
Wed Nov 24 16:18:14 CET 2004
On 24 Nov 2004, at 10:16, Christoph Lehmann wrote:
> Dear all, not really an R question but:
>
> If I want to check for the classification accuracy of a LDA with
> previous PCA for dimensionality reduction by means of the LOOCV
> method:
>
> Is it OK to do the PCA on the WHOLE dataset ONCE and then run the LDA
> with the CV option set to TRUE (which runs LOOCV)?
>
> -- OR--
>
> do I need
> - to compute a PCA (my.princomp.1) for each 'training bag' (the n-1
> remaining observations),
> - then run the LDA on the training-bag scores (-> my.lda.1)
> - then compute the scores of the left-out observation using
> my.princomp.1 (-> my.scores.2)
> - and only then use predict.lda(my.lda.1, my.scores.2) on the scores
> of the left-out observation
>
> ?
> I have read some articles that use procedure 1, but I am not sure
> whether this is really correct.
As far as I understand your problem (assessing the predictive ability of
your model), you need the second procedure: the left-out observation
must never be seen while the model is being trained. If you run your PCA
on the whole set, the left-out observation contributes to the centring
and the loadings, so it leaks into your training data. Keep in mind that
your classifier is made up of 2 components: PCA followed by LDA. The
procedure you describe is fine if you build your model with a fixed
number of PCs. If you also want to find an optimal number of PCs, that
choice has to be made in the same way, inside each fold, using only the
(n-1) training examples. A proper validation of the model can quickly
become tricky: it requires a bit of programming and may take longer to
run (especially with LOO)!
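A minimal sketch of the second procedure in R, for a fixed number of
PCs. The helper name loocv_pca_lda and its ncomp argument are made up
for illustration, prcomp() is used here in place of princomp(), and the
built-in iris data stands in for your own dataset. The last lines show
procedure 1 (PCA once on everything, then lda with CV = TRUE) for
comparison.

```r
library(MASS)  # provides lda()

# Hypothetical helper: LOOCV accuracy of PCA followed by LDA, with the
# PCA refitted inside every fold. ncomp (number of PCs kept) is fixed.
loocv_pca_lda <- function(x, y, ncomp) {
  x <- as.matrix(x)
  y <- factor(y)
  pred <- character(nrow(x))
  for (i in seq_len(nrow(x))) {
    # PCA on the n-1 training observations only
    pca <- prcomp(x[-i, , drop = FALSE], center = TRUE, scale. = FALSE)
    train_scores <- pca$x[, seq_len(ncomp), drop = FALSE]
    # LDA on the training scores
    fit <- lda(train_scores, grouping = y[-i])
    # Project the left-out observation using the training centre and loadings
    test_scores <- predict(pca, newdata = x[i, , drop = FALSE])
    test_scores <- test_scores[, seq_len(ncomp), drop = FALSE]
    pred[i] <- as.character(predict(fit, test_scores)$class)
  }
  mean(pred == as.character(y))  # proportion classified correctly
}

# Procedure 2: PCA and LDA both refitted in each fold
acc <- loocv_pca_lda(iris[, 1:4], iris$Species, ncomp = 2)

# Procedure 1, for comparison: PCA once on the whole set, then LOOCV
# of the LDA alone. The left-out observation has influenced the PCA.
pca_all <- prcomp(as.matrix(iris[, 1:4]))
naive <- lda(pca_all$x[, 1:2], grouping = iris$Species, CV = TRUE)
acc_naive <- mean(naive$class == iris$Species)
```

To choose the number of PCs as well, the same loop would need an inner
cross-validation over ncomp on the n-1 training observations, which is
where the computing time grows.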
Hope it helps
David
>
> many thanks for a hint
>
> Christoph