[R] about pam
Martin Maechler
maechler at stat.math.ethz.ch
Thu Mar 16 21:42:33 CET 2006
>>>>> "Linda" == Linda Lei <llei at bccrc.ca>
>>>>> on Thu, 16 Mar 2006 11:11:29 -0800 writes:
Linda> Hi there,
Linda> In the description of command "pam", it mentions "For
Linda> datasets larger than (say) 200 observations". Now my
Linda> dataset is a "54732 by 5" dataframe named
Linda> "test". When I try to run pam(test, 4), it shows
Linda> "cannot allocate vector of length 1497768547". Is it
Linda> because there are too many rows for it to handle?
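Yes.
pam() needs all n(n-1)/2 pairwise dissimilarities, which for
n = 54732 is about 1.5 billion values, matching the vector length
in the error message.
You need to use clara() instead of pam().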
Though with today's fast computers,
I'd advise increasing some of the defaults that made sense back
in the 80s: 'samples = 5' and 'sampsize' can easily be increased
by an order of magnitude to get a more stable
(i.e. less random) result.
Also for your relatively large data matrix, you may want to use
'keep.data = FALSE'.
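A minimal sketch of the above, under stated assumptions: the data frame
below is simulated as a stand-in for your 'test' (same dimensions, random
values), and the particular values 'samples = 50' and 'sampsize = 500' are
only illustrative choices, not tuned recommendations.

```r
library(cluster)  # provides pam() and clara()

# Why pam() fails here: it needs all n*(n-1)/2 pairwise dissimilarities.
n <- 54732
n_pairs <- n * (n - 1) / 2
n_pairs  # 1497768546, one less than the length in the error message

# Hypothetical stand-in for the 54732 x 5 data frame 'test'.
set.seed(1)
test <- as.data.frame(matrix(rnorm(n * 5), ncol = 5))

# clara() runs the pam algorithm on subsamples, then assigns every
# observation to its nearest medoid, so the full dissimilarity
# matrix is never formed.
fit <- clara(test, k = 4,
             samples   = 50,     # illustrative; the default is 5
             sampsize  = 500,    # illustrative; larger than the default
             keep.data = FALSE)  # do not store a copy of the data in 'fit'

fit$medoids            # the 4 medoids (rows of the data)
table(fit$clustering)  # cluster sizes
```

Larger 'samples' and 'sampsize' cost more time per run, so it may be worth
increasing them stepwise and checking whether the medoids stop changing.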
Regards,
Martin Maechler, ETH Zurich