[R] random variable selection algorithms for lda?

Thomas Schu th.schumann at gmx.de
Tue Aug 14 16:02:11 CEST 2012


Dear R-experts,

I would like to find the variable combination that maximises the
accuracy of a cross-validated reclassification.
My data consist of 36 samples, equally distributed across 6 groups, and each
sample is characterised by 20 variables.

data <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))  # placeholder: 36 samples x 20 variables
group <- factor(rep(1:6, each = 6))                       # 6 groups of 6 samples each

I tried to tackle the variable selection problem by
1. using functions from the "klaR" package,
2. hand picking.

On the one hand, I minimised Wilks' lambda and passed the resulting
variables to the lda function; on the other hand, I used stepclass()
forward and backward.

*BUT compared to my hand-picked variable selections, both functions yielded
lower prediction accuracy!*
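
As a rough sketch of the two klaR routes described above: the data here are simulated placeholders, the names `x`, `d`, `gw`, `sc_fwd` and `sc_bwd` are made up for illustration, and `niveau = 0.2` is only an assumed threshold, not the setting actually used.

```r
library(MASS)  # lda()
library(klaR)  # greedy.wilks(), stepclass()

set.seed(1)
# Placeholder data (assumption): 36 samples x 20 variables, 6 groups of 6
x <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))
group <- factor(rep(1:6, each = 6))
x[, 1:3] <- x[, 1:3] + as.integer(group)  # give three variables some signal
d <- data.frame(group = group, x)

# Route 1: stepwise selection on Wilks' lambda, then leave-one-out LDA
gw  <- greedy.wilks(group ~ ., data = d, niveau = 0.2)
fit <- lda(gw$formula, data = d, CV = TRUE)
acc_wilks <- mean(fit$class == d$group)  # overall cross-validated accuracy

# Route 2: stepclass() with lda, forward and backward
sc_fwd <- stepclass(group ~ ., data = d, method = "lda", direction = "forward")
sc_bwd <- stepclass(group ~ ., data = d, method = "lda", direction = "backward")
sc_fwd$formula  # variables kept by forward selection
sc_bwd$formula  # variables kept by backward selection
```

Both routes judge a candidate variable by a single overall criterion, so a variable that helps one group while hurting another can be rejected even if a later variable would repair the damage.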
(One reason my hand picks were "better" is the shift in within-group
accuracy when new variables are picked, which stepclass() does not account
for. Adding a new variable can increase the accuracy of one group and
decrease the accuracy of another group, but adding a second variable can
raise the accuracy of the "decreased" group again without negative effects
on the other groups.)
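
The per-group effect described here could be inspected along these lines. This is a minimal sketch on simulated placeholder data; `group_accuracy()` is a hypothetical helper, and the column names `V1`, `V2`, `V3` are just the defaults that `as.data.frame()` assigns to a matrix.

```r
library(MASS)  # lda()

set.seed(1)
# Placeholder data (assumption): 36 samples x 20 variables, 6 groups of 6
data  <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))
group <- factor(rep(1:6, each = 6))

# Leave-one-out accuracy within each group for a given variable subset
group_accuracy <- function(vars) {
  fit <- lda(data[, vars, drop = FALSE], grouping = group, CV = TRUE)
  tapply(fit$class == group, group, mean)
}

acc_a <- group_accuracy(c("V1", "V2"))
acc_b <- group_accuracy(c("V1", "V2", "V3"))
round(acc_b - acc_a, 2)  # per-group change in accuracy from adding V3
```

A positive entry means the added variable helped that group, a negative entry means it hurt; the overall accuracy alone hides this trade-off.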

1. What other algorithms could I use?

2. Does R offer an algorithm that selects variables randomly?

3. Is there a function/algorithm for listing all possible variable
combinations?
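
For questions 2 and 3, base R's `sample()` and `combn()` might be a starting point. The following is only a sketch on simulated placeholder data; `cv_accuracy()` is a hypothetical helper, and the subset sizes (3 and 2) and the number of random draws (50) are arbitrary assumptions.

```r
library(MASS)  # lda()

set.seed(1)
# Placeholder data (assumption): 36 samples x 20 variables, 6 groups of 6
data  <- as.data.frame(matrix(rnorm(36 * 20), nrow = 36))
group <- factor(rep(1:6, each = 6))

# Overall leave-one-out accuracy of lda for a given variable subset
cv_accuracy <- function(vars) {
  fit <- lda(data[, vars, drop = FALSE], grouping = group, CV = TRUE)
  mean(fit$class == group)
}

# Question 2: draw random variable subsets with sample()
random_sets <- replicate(50, sample(names(data), 3), simplify = FALSE)
random_acc  <- vapply(random_sets, cv_accuracy, numeric(1))
random_sets[[which.max(random_acc)]]  # best of the random draws

# Question 3: list all subsets of a given size with combn()
all_sets <- combn(names(data), 2, simplify = FALSE)
length(all_sets)                      # choose(20, 2) = 190 pairs
all_acc <- vapply(all_sets, cv_accuracy, numeric(1))
all_sets[[which.max(all_acc)]]        # best pair

# Across all subset sizes there are 2^20 - 1 = 1048575 combinations
sum(choose(20, 1:20))
```

With only 36 samples, the best subset found by searching many combinations will tend to look better in cross-validation than it would on new data, so the winning accuracy should be read with caution.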

Thank you,
I'm grateful for any suggestions.

Best regards
Thomas







--
View this message in context: http://r.789695.n4.nabble.com/random-variable-selection-algorithms-for-lda-tp4640267.html
Sent from the R help mailing list archive at Nabble.com.


