[R] randomForest
Martin Maechler
maechler at stat.math.ethz.ch
Mon Jul 11 08:48:36 CEST 2005
>>>>> "Duncan" == Duncan Murdoch <murdoch at stats.uwo.ca>
>>>>> on Thu, 07 Jul 2005 15:44:38 -0400 writes:
Duncan> On 7/7/2005 3:38 PM, Weiwei Shi wrote:
>> Hi there:
>> I have a question on random forest:
>>
>> recently I helped a friend with her random forest and I ran into this problem:
>> her dataset has 6 classes and the sample size is pretty small (264);
>> the class distribution is like this (Diag is the response variable):
>> sample.size <- lapply(1:6, function(i) sum(Diag==i))
>>> sample.size
>> [[1]]
>> [1] 36
....
and later you ran into problems because you didn't know that a *list*
such as 'sample.size' has to be turned into a so-called
*atomic vector* (there is a function, is.atomic(.), to test for that);
Duncan and others have already told you about unlist().
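To make the list vs. atomic vector distinction concrete, here is a small sketch. The 'Diag' values are made up for illustration (the original data are not shown in this thread); only the lapply() call is taken from the question.

```r
# Hypothetical stand-in for the response variable (NOT the poster's data)
Diag <- c(1, 1, 2, 3, 3, 3, 4, 5, 6, 6)

# The construction from the question: one count per class, stored in a list
sample.size <- lapply(1:6, function(i) sum(Diag == i))

is.list(sample.size)     # a list ...
is.atomic(sample.size)   # ... and hence not an atomic vector

# unlist() flattens the list into an atomic (here: integer) vector
counts <- unlist(sample.size)
is.atomic(counts)
```

Functions that expect a plain numeric vector of per-class sizes should be given something like 'counts' rather than the list itself.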
Now there are two things I'd want to add:
1) If you had used
s.size <- table(Diag)
you would have had a faster and simpler expression giving the same counts.
Though in general (when there can be zero counts), to give the
same result, you'd need
s.size <- table(factor(Diag, levels = 1:6))
That is still somewhat preferable to the lapply(.) version, IMO.
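A short sketch of why the factor(.) version matters when a class has no observations. Again, 'Diag' is invented here (class 4 is deliberately absent) and is only meant to illustrate the point.

```r
# Hypothetical response with no observations in class 4
Diag <- c(1, 1, 2, 3, 3, 3, 5, 6, 6)

# Plain table(): only the classes that actually occur get an entry
s.short <- table(Diag)
length(s.short)

# Fixing the levels keeps all six classes, with a zero for the empty one
s.size <- table(factor(Diag, levels = 1:6))
length(s.size)
s.size[["4"]]

# If a bare integer vector is wanted, drop the table attributes
as.vector(s.size)
```

With the levels fixed, the result always has one entry per class, in the order 1:6, which is what the lapply(.) construction produced as well.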
2) You should get into the habit of using
sapply(.) rather than lapply(.) for tasks like this one.
sapply() was devised for exactly this kind of task;
its name stands for ``[s]implified lapply''.
It returns an ``unlisted'' result (a vector or matrix) whenever the
individual results can be simplified that way.
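For comparison, a minimal sketch of the two calls side by side, using made-up values for 'Diag' (the real data are not part of this thread):

```r
# Hypothetical stand-in for the response variable (NOT the poster's data)
Diag <- c(1, 1, 2, 3, 3, 3, 4, 5, 6, 6)

# lapply(): always a list, one element per class
s.list <- lapply(1:6, function(i) sum(Diag == i))

# sapply(): same computation, simplified to an atomic vector
s.vec <- sapply(1:6, function(i) sum(Diag == i))

is.list(s.list)
is.atomic(s.vec)

# sapply() gives here what unlist() would give on the lapply() result
identical(s.vec, unlist(s.list))
```

Since every call returns a single number, sapply() can simplify the result to a vector of length 6, so no separate unlist() step is needed.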
Regards,
Martin Maechler, ETH Zurich