[R] about pam

Martin Maechler maechler at stat.math.ethz.ch
Thu Mar 16 21:42:33 CET 2006


>>>>> "Linda" == Linda Lei <llei at bccrc.ca>
>>>>>     on Thu, 16 Mar 2006 11:11:29 -0800 writes:

    Linda> Hi there,

    Linda> In the description of command "pam", it mentions "For
    Linda> datasets larger than (say) 200 observations". Now my
    Linda> dataset is a "54732 by 5" dataframe named
    Linda> "test". When I try to run pam(test,4),it shows "
    Linda> cannot allocate vector of length 1497768547". Is it
    Linda> because the row too big that it can't handle?
Yes. pam() computes and stores all pairwise dissimilarities,
which is far too much for 54732 observations.
You need clara() instead of pam().
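
As a rough check of why pam() fails here: the length in the error
message matches the number of pairwise dissimilarities among 54732
rows, n(n-1)/2, plus one. The extra element and the 8 bytes per
double are my reading of the error message, so treat the memory
figure as an estimate rather than an exact accounting.

```r
# Number of rows in the poster's data frame
n <- 54732

# Pairwise dissimilarities among n observations, plus one extra element
# (the "+ 1" is inferred from the error message, not from documentation)
n_diss <- n * (n - 1) / 2 + 1
n_diss == 1497768547      # TRUE: same length as in the error message

# Approximate memory for that many doubles (8 bytes each), in GiB
gib <- n_diss * 8 / 2^30
gib                       # roughly 11 GiB, far more than pam() can allocate here
```

Since the count grows quadratically with the number of rows, pam() is
only practical for a few thousand observations at most.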

Though with today's fast computers,
I'd advise increasing some of the defaults that made sense back
in the 80s: 'samples = 5' and 'sampsize' can easily be increased
by an order of magnitude to get a more stable
(i.e. less random) result.
Also, for your relatively large data matrix, you may want to use
'keep.data = FALSE', so that the result does not carry a copy of
the data.
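
A minimal sketch of such a call, assuming the 'cluster' package is
installed. The data here are synthetic stand-ins for the real "test"
data frame, and the values samples = 50 and sampsize = 500 are only
illustrative choices, about ten times the defaults; tune them to
your machine and data.

```r
library(cluster)  # provides pam() and clara()

set.seed(1)
# Synthetic stand-in for the poster's 54732 x 5 data frame "test"
test <- as.data.frame(matrix(rnorm(54732 * 5), ncol = 5))

fit <- clara(test, k = 4,
             samples   = 50,     # illustrative; the default is 5
             sampsize  = 500,    # illustrative; the default is 40 + 2 * k, i.e. 48 here
             keep.data = FALSE)  # do not store a copy of the data in the result

fit$medoids              # 4 x 5 matrix of medoid coordinates
table(fit$clustering)    # cluster sizes over all 54732 rows
```

clara() only ever clusters 'sampsize' rows at a time with the pam
algorithm and then assigns the remaining rows to the nearest medoid,
so the memory needed stays small regardless of the number of rows.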

Regards,
Martin Maechler, ETH Zurich



