[R] klaR stepclass

Uwe Ligges ligges at statistik.uni-dortmund.de
Tue Jun 5 08:54:58 CEST 2007



Neil Losin wrote:
> Hi,
> 
> I'm trying to use "stepclass" to do a stepwise variable selection with
> method=lda. I keep getting this warning message, which shows up once
> for each variable added to the model during variable selection:
> 
> Warning message:
> error(s) in modeling/prediction step in: cv.rate(vars = c(model,
> tryvar), data = data, grouping = grouping,

This means that in some steps of the stepwise procedure, one of the 
functions for estimating the parameters or for predicting (in this case 
lda or predict.lda) generates an error. This typically happens when the 
covariance matrix of one of the intermediately tried models is singular 
and therefore cannot be inverted (for example because of collinear 
variables or an unluckily chosen partition of the data for 
cross-validation).
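
A quick way to check for the collinearity case is to look at the rank of 
the pooled within-group covariance matrix. The following is only a 
minimal sketch in base R: the iris data and the artificial column "Sum" 
are assumptions made for illustration, not the poster's data, and the 
check is a diagnostic of my own, not something stepclass does itself.

```r
# Illustrative data (an assumption): iris plus a column that is an exact
# linear combination of two existing columns, so the variables are collinear.
X <- as.matrix(iris[, 1:4])
X <- cbind(X, Sum = X[, "Sepal.Length"] + X[, "Sepal.Width"])
grouping <- iris$Species

# Center every variable within its group, then pool the cross-products
# to obtain the within-group covariance matrix.
group_means <- apply(X, 2, function(col) ave(col, grouping))
centered <- X - group_means
W <- crossprod(centered) / (nrow(X) - nlevels(grouping))

# A rank below the number of variables means W is singular
# and cannot be inverted.
rank_W <- qr(W)$rank
is_singular <- rank_W < ncol(W)
print(is_singular)
```

Running the same check on the rows of a single cross-validation 
training partition shows whether an unlucky split is the cause.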


> I don't know how to interpret this warning. I do not have a separate
> data set for cross validation. Is this important, or will "stepclass"
> do leave-one-out cross-validation in the absence of other
> cross-validation data?

By default, it uses 10-fold cross-validation, which can be changed to 
leave-one-out by setting the number of folds to the number of 
observations in your data frame:
   fold = nrow(dataframe)
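
As a sketch of a complete call, assuming the iris data stand in for your 
own data frame and Species for your grouping variable (both are 
placeholders, as are the object names):

```r
library(klaR)  # provides stepclass(); lda() comes from MASS, which klaR loads

# Placeholder data: iris, with Species as the grouping variable.
# Default behaviour: 10-fold cross-validation (fold = 10).
sel_10fold <- stepclass(Species ~ ., data = iris, method = "lda")

# Leave-one-out: one fold per observation.
sel_loo <- stepclass(Species ~ ., data = iris, method = "lda",
                     fold = nrow(iris))

# Variables selected under leave-one-out
print(sel_loo$model)
```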

> Also, when I run "stepclass" several times with identical parameters,
> it will give me a slightly different "best" model each time. It seems
> to me that it should always return the same model, as long as the
> dataset and call are the same. Am I missing something? Is there some
> randomization going on behind the scenes that I'm unaware of?

Yes, 10-fold cv is performed with randomly chosen partitions of the 
data, so the selected model can differ between runs. You can either run 
leave-one-out cv (which is rather time consuming) or call set.seed() 
with a fixed value before each stepclass() call in order to get 
reproducible results.
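
A small sketch of the set.seed() approach, again with iris as placeholder 
data and an arbitrary seed value (123 is an assumption, any fixed integer 
works):

```r
library(klaR)

# Fix the random number generator state before each run so that the
# cross-validation partitions are drawn identically.
set.seed(123)
run1 <- stepclass(Species ~ ., data = iris, method = "lda", output = FALSE)

set.seed(123)
run2 <- stepclass(Species ~ ., data = iris, method = "lda", output = FALSE)

# With the same seed, both runs should select the same variables.
same_model <- identical(run1$model, run2$model)
print(same_model)
```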

Uwe Ligges


> Thanks in advance,
> Neil Losin
>


