[Rd] k means

friedrich.leisch at stat.uni-muenchen.de
Tue May 13 13:11:29 CEST 2008


>>>>> On Mon, 12 May 2008 19:24:55 +0200,
>>>>> cgenolin  (c) wrote:

  > Hi the devel list,
  > I am using k-means with a non-standard distance. As far as I can see,
  > the function kmeans can use 4 different algorithms, but not a
  > user-defined distance.

  > In addition, kmeans cannot deal with missing values, whereas there
  > are several ways for k-means to handle them; one is to use a
  > distance that takes missing values into account, like a distance
  > with Gower adjustment (which is the regular distance dist() used
  > in R).

  > So is it possible to adapt kmeans to let the user give an argument
  > 'distance to use'?

As Bill Venables already pointed out, that does not make much sense,
especially as there are already R functions for doing that. Package
flexclust implements a k-means-type clustering algorithm where the
user can provide arbitrary distance measures; have a look at

     http://www.stat.uni-muenchen.de/~leisch/papers/Leisch-2006.pdf

The code you need to write for using a new distance measure is
minimal, and there are two examples in the paper describing in detail
what needs to be done.
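
To give a rough idea of what this can look like, here is a minimal
sketch (not one of the examples from the paper). It assumes that the
distance function takes the data matrix and the matrix of centers and
returns a matrix with one row per observation and one column per
center, and that the centroid function maps the rows of one cluster to
a single center. The names distL1 and centMedian and the simulated
data are made up for illustration; the flexclust part only runs if the
package is installed.

```r
# Hypothetical distance: Manhattan (L1) distance between every row of x
# and every row of centers; result is nrow(x) by nrow(centers).
distL1 <- function(x, centers) {
  z <- matrix(0, nrow = nrow(x), ncol = nrow(centers))
  for (k in seq_len(nrow(centers))) {
    # subtract center k from each row, then sum absolute differences
    z[, k] <- rowSums(abs(sweep(x, 2, centers[k, ])))
  }
  z
}

# Hypothetical centroid: column-wise median of the cluster members,
# which is the natural center for L1 distance.
centMedian <- function(x) apply(x, 2, median)

if (requireNamespace("flexclust", quietly = TRUE)) {
  library(flexclust)
  set.seed(1)
  # Simulated data: two groups of 50 points in 2 dimensions
  x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
             matrix(rnorm(100, mean = 4), ncol = 2))
  # Bundle distance and centroid into a family and cluster with it
  fam <- kccaFamily(dist = distL1, cent = centMedian, name = "my L1 family")
  fit <- kcca(x, k = 2, family = fam)
  print(table(clusters(fit)))
}
```

Any other distance can be plugged in the same way, as long as it comes
with a matching centroid function.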

Hope this helps,
Fritz Leisch

-- 
-----------------------------------------------------------------------
Prof. Dr. Friedrich Leisch 

Institut für Statistik                          Tel: (+49 89) 2180 3165
Ludwig-Maximilians-Universität                  Fax: (+49 89) 2180 5308
Ludwigstraße 33
D-80539 München                     http://www.statistik.lmu.de/~leisch
-----------------------------------------------------------------------
   Journal Computational Statistics --- http://www.springer.com/180 
          Münchner R Kurse --- http://www.statistik.lmu.de/R
