[Rd] k means

Bill.Venables at csiro.au Bill.Venables at csiro.au
Tue May 13 01:12:46 CEST 2008


I would not support an extension of kmeans to do this.  I think it is
best left simple and fast as it now is.  

I can think of three ways you might handle your problem:

1. Use, for example, pam() in the cluster package, which does a similar
job to kmeans (not quite the same, of course) with a general distance
measure.

2. If you are working with a non-standard metric and you really want to
use the k-means algorithm, then perhaps one way to do so is to first find
an approximate Euclidean coordinatisation of the points by
multidimensional scaling (e.g. cmdscale, isoMDS, sammon, ...) and then
run kmeans on those coordinates.  I've no idea what the traps are with
this approach, but it seems kind of feasible.

3. If the algorithms are there and available as you say, write the code
yourself and contribute it to the R-project as a simple package.
Everyone will benefit. 
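
A minimal sketch of options 1 and 2, in case it helps to see them side
by side.  The data set (iris), the choice of k = 3, the two MDS
dimensions and the Manhattan metric are all illustrative assumptions
standing in for your own data and distance; nothing here is tuned.

```r
# Sketch only: iris, k = 3 and the Manhattan metric are assumptions
# chosen for illustration, not part of the original question.
library(cluster)   # provides pam()

x <- scale(iris[, 1:4])
d <- dist(x, method = "manhattan")   # stand-in for a non-standard distance

# Option 1: partitioning around medoids, directly on the dissimilarities
fit_pam <- pam(d, k = 3, diss = TRUE)

# Option 2: approximate Euclidean coordinates by classical MDS,
# then ordinary kmeans on those coordinates
set.seed(1)                          # kmeans uses random starts
coords <- cmdscale(d, k = 2)         # n x 2 matrix of coordinates
fit_km <- kmeans(coords, centers = 3, nstart = 10)

# Cross-tabulate the two partitions to see how far they agree
table(pam = fit_pam$clustering, kmeans = fit_km$cluster)
```

Since dist() itself tolerates missing values (it drops them pairwise and
rescales the sum), either route may also cope with NAs in the data,
provided every pair of rows shares at least one observed column.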


Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163
AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary):  +61 7 3826 7304
Mobile:                         +61 4 8819 4402
Home Phone:                     +61 7 3286 7700
mailto:Bill.Venables at csiro.au
http://www.cmis.csiro.au/bill.venables/ 

-----Original Message-----
From: r-devel-bounces at r-project.org
[mailto:r-devel-bounces at r-project.org] On Behalf Of
cgenolin at u-paris10.fr
Sent: Tuesday, 13 May 2008 3:25 AM
To: r-devel at r-project.org
Subject: [Rd] k means

Hi the devel list,

I am using k-means with a non-standard distance. As far as I can see, the 
function kmeans can use 4 different algorithms, but not a user-defined 
distance.

In addition, kmeans cannot deal with missing values, whereas there are 
several ways k-means could handle them; one is to use a distance that 
takes the missing values into account, such as a distance with Gower 
adjustment (which is what the regular dist() function in R uses).

So would it be possible to adapt kmeans so that the user can supply a 
'distance to use' argument?

Christophe

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


