[R] randomForest

Weiwei Shi helprhelp at gmail.com
Mon Jul 11 16:41:41 CEST 2005


Thanks.
Many people pointed that out. (It was because I only knew lapply at
that time :).


On 7/11/05, Martin Maechler <maechler at stat.math.ethz.ch> wrote:
> >>>>> "Duncan" == Duncan Murdoch <murdoch at stats.uwo.ca>
> >>>>>     on Thu, 07 Jul 2005 15:44:38 -0400 writes:
> 
>     Duncan> On 7/7/2005 3:38 PM, Weiwei Shi wrote:
>     >> Hi there:
>     >> I have a question on random forest:
>     >>
>     >> recently I helped a friend with her random forest and I came across this problem:
>     >> her dataset has 6 classes and since the sample size is pretty small:
>     >> 264 and the class distr is like this (Diag is the response variable)
> 
>     >> sample.size <- lapply(1:6, function(i) sum(Diag==i))
>     >>> sample.size
>     >> [[1]]
>     >> [1] 36
> 
>     ....
> 
> and later you get problems because you didn't know that a *list*
> such as 'sample.size' should be made into a so called
> *atomic vector* {and there's a function  is.atomic(.) ! to test for it}
> and Duncan and others told you about unlist().
> 
> Now there are two things I'd want to add:
> 
> 1) If you had used
> 
>       s.size <- table(Diag)
> 
>    you would have used a faster and simpler expression with the same result.
>    Though in general (when there can be zero counts), to give the
>    same result, you'd need
> 
>       s.size <- table(factor(Diag, levels = 1:6))
> 
>    Still a bit preferable to the lapply(.) IMO
> 
> 
> 2)  You should get into the habit of using
>      sapply(.)   rather than  lapply(.).
> 
>     sapply() originally was exactly devised for the above task:
>     and stands for ``[s]implified lapply''.
> 
>     It always returns an ``unlisted'' result when appropriate.
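For the record, here is a small sketch of the lapply() / sapply() / unlist() difference described in point 2). The 'Diag' vector below is a made-up stand-in (the real data are not shown in this thread), so the counts are illustrative only.

```r
# Hypothetical stand-in for the response variable (not the real data).
Diag <- c(1, 1, 2, 3, 3, 3, 5, 6, 6)

# lapply() always returns a list, one element per class.
size.list <- lapply(1:6, function(i) sum(Diag == i))
is.list(size.list)     # a list ...
is.atomic(size.list)   # ... and therefore not an atomic vector

# sapply() runs the same computation but simplifies the result
# to an atomic (integer) vector when every element has length 1.
size.vec <- sapply(1:6, function(i) sum(Diag == i))
is.atomic(size.vec)

# unlist() gets from the list to the same vector after the fact.
identical(unlist(size.list), size.vec)
```

Either sapply() from the start or unlist() afterwards should give a vector that can be passed where an atomic vector is expected.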
> 
> Regards,
> Martin Maechler, ETH Zurich
> 
> 
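A quick check of point 1) as well, again on a made-up 'Diag' in which class 4 happens to be absent (an assumption chosen only to show the zero-count case):

```r
# Hypothetical stand-in for the response variable; class 4 never occurs.
Diag <- c(1, 1, 2, 3, 3, 3, 5, 6, 6)

# Plain table() only reports the classes that are present.
tab.present <- table(Diag)
length(tab.present)          # 5 entries, class 4 is missing

# Fixing the levels keeps one entry per class, including zero counts.
s.size <- table(factor(Diag, levels = 1:6))
length(s.size)               # 6 entries

# as.vector() drops the table attributes and leaves the plain counts.
counts <- as.vector(s.size)
counts
```

With the levels fixed, the counts line up with classes 1 to 6 in order, which is what the lapply() version was computing.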


-- 
Weiwei Shi, Ph.D

"Did you always know?"
"No, I did not. But I believed..."
---Matrix III



