[R] passing known medoids to clara() in the cluster package

Dylan Beaudette dylan.beaudette at gmail.com
Mon Apr 10 23:25:27 CEST 2006

Thanks for the reply.

On Sunday 09 April 2006 11:46 pm, Martin Maechler wrote:
> >>>>> "DylanB" == Dylan Beaudette <dylan.beaudette at gmail.com>
> >>>>>     on Sun, 9 Apr 2006 19:28:44 -0700 writes:
>     DylanB> Greetings, I have had good success using the clara()
>     DylanB> function to perform a simple cluster analysis on a
>     DylanB> large dataset (1 million+ records with 9 variables).
>     DylanB> Since the clara function is a wrapper to pam(),
>     DylanB> which will accept known medoid data - I am wondering
>     DylanB> if this too is possible with clara() ... The
>     DylanB> documentation does not suggest that this is
>     DylanB> possible.
> indeed, it doesn't --  because it's not yet possible.
> I (as maintainer of "cluster") had added the "known medoid"
> option to pam() last June (for cluster version 1.10.0),
> and had left a note in my TODO file to do the same for clara().

Ah, that would explain things! :) I will check back periodically to see
when this feature is completed.
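
For reference, the known-medoid option in pam() that Martin describes above
can be sketched roughly as follows. This is only a minimal illustration on
made-up toy data: the matrix `x` and the row indices in `known` are
assumptions for the example, not my real dataset. As far as I understand the
documentation, `medoids` supplies the *initial* medoids as row indices, and
the swap phase may still move them.

```r
library(cluster)

set.seed(1)
# Toy data (invented for illustration): three well-separated groups, 2 variables
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 6, sd = 0.3), ncol = 2)
)

# Hypothetical row indices of observations thought to typify each class
known <- c(1, 21, 41)

# Start pam() from the supplied medoids instead of its default "build" step
fit <- pam(x, k = 3, medoids = known)

fit$id.med              # row indices of the final medoids
table(fit$clustering)   # cluster sizes
```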

> Unfortunately it's not true that clara() is a wrapper around pam(),
> as you state above.

I must have misread the manual pages...

> Given your wish and clear "use case" situation, I'm more
> motivated to approach this particular 'TODO' item!
> Martin Maechler, ETH Zurich
>     DylanB> Essentially I am trying to implement a "supervised
>     DylanB> classification" of numerous geographic data
>     DylanB> layers. The "unsupervised" approach using clara()
>     DylanB> works well, but I feel the output classes would be
>     DylanB> more meaningful if I were able to let clara() know
>     DylanB> about the classes that I have in mind.
>     DylanB> Is this at all feasible, or am I trying to
>     DylanB> accomplish something that is not possible?

Thanks Martin! 

I will give pam() a try, and see if it can handle the large dataset that I am 
currently using clara() for -- usually only about 5 seconds are required for 
clara() to complete.
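
Since pam() works from the full set of pairwise dissimilarities, a million
rows is probably out of reach for it. One possible workaround, sketched below
under stated assumptions, is to run pam() with known medoids on a random
subsample and then assign every row to its nearest medoid. This is not
clara()'s own algorithm, just a rough stand-in; `big`, `known_rows`, and the
subsample size are all hypothetical.

```r
library(cluster)

set.seed(42)
n   <- 10000
big <- matrix(rnorm(n * 9), ncol = 9)    # stand-in for the large 9-variable dataset

known_rows <- c(10, 200, 3000)           # hypothetical rows typifying each class

# Subsample that is guaranteed to contain the known rows (placed first)
sub_idx <- union(known_rows, sample(n, 500))
sub     <- big[sub_idx, ]

# The known rows occupy positions 1..k of the subsample
fit <- pam(sub, k = length(known_rows), medoids = seq_along(known_rows))
med <- fit$medoids                       # k x 9 matrix of medoid coordinates

# Squared Euclidean distance from every row of `big` to each medoid
d <- sapply(seq_len(nrow(med)),
            function(j) rowSums(sweep(big, 2, med[j, ])^2))

# Assign each row to its nearest medoid
cl <- apply(d, 1, which.min)
table(cl)
```

The swap phase of pam() may still replace the starting medoids, so the final
classes are guided by the known rows rather than fixed to them.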

Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
