[R] random variable selection algorithms for lda?
Thomas Schu
th.schumann at gmx.de
Tue Aug 14 16:02:11 CEST 2012
Dear R-experts,
I would like to find the combination of variables that maximises the
accuracy of a cross-validated reclassification.
My data consist of 36 samples, equally distributed across 6 groups, and each
sample is characterised by 20 variables.
data <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))  # placeholder values: 36 samples x 20 variables
group <- factor(rep(1:6, each = 6))  # 6 groups of 6 samples each
I tried to tackle the variable selection problem by
1. using functions from the "klaR" package,
2. hand picking.
On the one hand, I minimised Wilks' lambda and used the resulting
variables in the lda function; on the other hand, I used stepclass()
forward and backward:
*BUT compared to my hand-picked variable selections, both functions yielded
lower prediction accuracy!*
(One reason my hand picks were "better" is the shift in within-group
accuracy when new variables are added, which stepclass() does not take
into account. Adding a new variable can increase the accuracy for one
group while decreasing it for another, but adding a second variable may
then restore the accuracy of the "decreased" group without negative
effects on the other groups.)
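A minimal sketch of how this per-group effect could be inspected. It assumes simulated placeholder data in place of the real measurements and uses lda() from MASS with leave-one-out cross-validation (CV = TRUE). The helper cv_accuracy() and the variable subsets c(1, 2) and c(1, 2, 3) are hypothetical, chosen only for illustration.

```r
library(MASS)  # provides lda()

set.seed(1)
# Placeholder data (assumption): 36 samples x 20 variables, 6 groups of 6
X <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))
group <- factor(rep(1:6, each = 6))

# Hypothetical helper: leave-one-out cross-validated accuracy,
# overall and per group, for a given set of column indices
cv_accuracy <- function(vars) {
  fit <- lda(X[, vars, drop = FALSE], grouping = group, CV = TRUE)
  tab <- table(actual = group, predicted = fit$class)
  list(overall   = sum(diag(tab)) / sum(tab),
       per_group = diag(tab) / rowSums(tab))
}

acc_a <- cv_accuracy(c(1, 2))     # two variables
acc_b <- cv_accuracy(c(1, 2, 3))  # same two plus a third

# Positive entries: groups that gained accuracy from the third variable;
# negative entries: groups that lost accuracy
acc_b$per_group - acc_a$per_group
```

Comparing the per_group vectors before and after adding a variable shows which groups gain and which lose, something a single overall rate hides.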
1. Which other algorithms could I use?
2. Does R offer an algorithm that selects variables randomly?
3. Is there a function/algorithm for listing all possible variable
combinations?
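To make questions 2 and 3 concrete, here is a rough base-R sketch of what "listing all combinations" and "selecting variables randomly" could look like for 20 variables. It uses only choose(), combn() and sample(); the subset size 3 and the seed are arbitrary assumptions for illustration.

```r
p <- 20  # number of variables, as in the data described above

# How many subsets exist: for size 3, and for all sizes 1..20 together
n_size3 <- choose(p, 3)         # 1140
n_all   <- sum(choose(p, 1:p))  # 2^20 - 1 = 1048575

# All subsets of a fixed size: one column per combination of column indices
combos3 <- combn(p, 3)
dim(combos3)                    # 3 rows, 1140 columns

# One random subset: draw a random size, then that many distinct columns
set.seed(42)
k <- sample(1:p, 1)
random_vars <- sort(sample(p, k))
```

Each column of combos3, or each random_vars draw, is a vector of column indices that could be used to subset the data before calling lda(). Listing every subset of every size is feasible for 20 variables (about one million), but fitting a cross-validated model for each one would take noticeably longer.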
Thank you,
I would be grateful for any suggestions.
Best regards
Thomas
--
View this message in context: http://r.789695.n4.nabble.com/random-variable-selection-algorithms-for-lda-tp4640267.html
Sent from the R help mailing list archive at Nabble.com.