[R] Pre-model Variable Reduction
Harsh
singhalblr at gmail.com
Tue Dec 9 11:34:01 CET 2008
Hello All,
I am trying to carry out variable reduction. I do not have information
about the dependent variable; I have only the X variables.
In selecting the variables to keep, I have considered the following criteria:
1) Percentage of missing values in each column/variable.
2) Variance of each variable, with a cut-off value.
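
The two criteria above could be applied roughly as in the sketch below. The helper name screen_columns and the cut-offs (30% missing, variance of 0.01) are only illustrative assumptions, not recommended values. Non-numeric columns are passed through the variance check untouched, since variance is not defined for them.

```r
# Hypothetical helper (not from any package): returns the names of the
# columns that pass a missing-value screen and a variance screen.
screen_columns <- function(df, max_missing = 0.30, min_variance = 0.01) {
  # Fraction of missing values in each column.
  missing_frac <- colMeans(is.na(df))

  # Variance is computed for numeric columns only; others stay NA.
  is_num <- vapply(df, is.numeric, logical(1))
  variance <- rep(NA_real_, ncol(df))
  names(variance) <- names(df)
  variance[is_num] <- vapply(df[is_num], var, numeric(1), na.rm = TRUE)

  # Numeric columns must meet the variance cut-off;
  # non-numeric columns are judged on missingness alone.
  passes_variance <- !is_num | (!is.na(variance) & variance >= min_variance)
  keep <- missing_frac <= max_missing & passes_variance
  names(df)[keep]
}

# Small made-up data set to show the behaviour.
demo <- data.frame(
  a = c(1, 2, 3, 4, 5),                      # varies, nothing missing
  b = c(7, 7, 7, 7, 7),                      # constant, zero variance
  c = c(NA, NA, NA, 1, 2),                   # 60% missing
  d = factor(c("x", "y", "x", "y", "x"))     # categorical
)
kept <- screen_columns(demo)
print(kept)
```

Note that raw variance depends on the scale of each variable, so a single cut-off is only meaningful if the variables are on comparable scales or are rescaled first.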
I recently came across Weka and found that there is an RWeka package
which would allow me to make use of Weka through R.
Weka provides a "Genetic search" variable reduction method, but I
could not find a corresponding R interface in the RWeka PDF manual on
CRAN.
I looked for other R packages that would allow me to do variable reduction
without considering a dependent variable. I came across the 'dprep'
package, but it does not appear to be available for Windows.
Moreover, my dataset contains both continuous and categorical
variables. Some categorical variables have 3 levels, some 10 levels, and
so on, up to a maximum of 50 levels (e.g. states in the USA).
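
For the categorical variables, where variance does not apply directly, one possible analogue is to look at the number of distinct levels and the share of the most frequent level. The sketch below is only an assumption about how such a summary might look; the helper name summarise_categorical and the example data are made up.

```r
# Hypothetical helper: for each factor or character column, report the
# number of observed levels and the share taken by the most common level.
summarise_categorical <- function(df) {
  is_cat <- vapply(df, function(x) is.factor(x) || is.character(x), logical(1))
  cat_cols <- df[is_cat]
  data.frame(
    variable = names(cat_cols),
    n_levels = vapply(cat_cols,
                      function(x) length(unique(x[!is.na(x)])),
                      integer(1)),
    top_share = vapply(cat_cols, function(x) {
      counts <- table(x)  # table() ignores NA by default
      if (length(counts) == 0 || sum(counts) == 0) NA_real_
      else max(counts) / sum(counts)
    }, numeric(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up example: one informative and one constant categorical column.
example_df <- data.frame(
  state = c("CA", "CA", "CA", "NY", "TX"),
  flag  = c("y", "y", "y", "y", "y"),
  x     = 1:5,
  stringsAsFactors = FALSE
)
res <- summarise_categorical(example_df)
print(res)
```

A column whose top_share is close to 1 carries little information, much like a near-constant numeric column; what threshold to use is a judgment call.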
Any suggestions in this regard will be much appreciated.
Thank you
Harsh Singhal
Decision Systems,
Mu Sigma, Inc.