[R] Clustering question \ dist(datmat)

Liaw, Andy andy_liaw at merck.com
Mon Mar 27 07:27:05 CEST 2006


as.dist() does _not_ recompute the distances if given a matrix.  It simply
takes the lower triangular portion of the distance matrix given and attaches
some attributes recording the original dimension.  So I don't think you have
any reason to object to using it.
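
A quick way to check this, as a minimal sketch: the angles below are made-up
uniform draws (an assumption, standing in for the rvm() samples in the
original post), and outer() is used in place of the double loop.  Converting
the "dist" object back with as.matrix() returns the original entries, whereas
dist(df) would compute new Euclidean distances between the rows of df.

```r
# Hypothetical example data: 25 angles in radians (not the poster's rvm() draws).
set.seed(1)
theta <- runif(25, min = 0, max = 2 * pi)

# Square matrix of D_ij = 0.5 * (1 - cos(theta_i - theta_j)), as in the question.
df <- 0.5 * (1 - cos(outer(theta, theta, "-")))

# as.dist() only repackages the lower triangle as an object of class "dist".
d <- as.dist(df)

# Converting back recovers the original entries, so nothing was recomputed.
stopifnot(isTRUE(all.equal(as.matrix(d), df, check.attributes = FALSE)))

# In contrast, dist() treats the rows of df as coordinates and computes
# new Euclidean distances between them, which differ from df.
stopifnot(!isTRUE(all.equal(as.matrix(dist(df)), df, check.attributes = FALSE)))

# hclust() accepts the "dist" object with any of its linkage methods.
hcA <- hclust(d, method = "average")
hcS <- hclust(d, method = "single")
```

plot(hcA) then draws the dendrogram, and any other hclust method string can
be substituted in the same call.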

Andy

From: kumar zaman
> 
> Dear Gabor and all,
>    
>   I know this will work, but I already have a distance matrix 
> calculated using my distance measure 
> Dij = 0.5 * (1 - cos(theta_i - theta_j)). If I do hclust(as.dist(df)), 
> then I am computing distances a second time on a matrix "df" that is 
> already supposed to be a distance matrix. I hope I am clear.
>    
>   PS: I just found out I can use "kmeans(df, 3, iter.max=100)"; 
> it will take df as calculated by Dij. I still need to use the 
> methods in hclust, such as "single, average, ward, median, 
> mcquitty", etc.
>    
>   Thank you anyway.
> 
> 
> Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
>   A distance matrix must be of class "dist". Try
> 
> hclust(as.dist(df))
> 
> 
> On 3/26/06, kumar zaman wrote:
> > Hello everybody. I am trying to cluster circular data (data points
> > which are angles), so I cannot use the "dist" function in "mclust"
> > to generate my distance matrix. I am using the function
> > Dij = 0.5*(1 - cos(theta_i - theta_j)). The thing is, "hclust" will
> > not accept this distance matrix. I tried to put it in a data frame,
> > but again I get an error message saying "Error in if (n < 2)
> > stop("must have n >= 2 objects to cluster") : argument is of length
> > zero". The distance matrix that "dist" produces is a lower triangular
> > one; mine is a square matrix, which I think does not matter. My
> > question is how to make "hclust" process my distance matrix, and what
> > I am doing wrong. I am sure the problem is with the distance matrix
> > format. Any suggestions are highly appreciated. The code below shows
> > what I have done.
> >
> > clust1<- as.vector(rvm(5,5,15))
> > clust2<- as.vector(rvm(5,10,15))
> > clust3<- as.vector(rvm(5,15,15))
> > clust4<- as.vector(rvm(5,20,15))
> > clust5<- as.vector(rvm(5,25,15))
> > data1<- rbind(clust1,clust2,clust3,clust4,clust5)
> > datmat<- matrix(data1,nrow=25,ncol=1,byrow=TRUE)
> > circ.plot(datmat)
> > df<- array(dim=c(25,25))
> > for (i in 1:25){
> > for (j in 1:25){
> > df[i,j]<- 0.5*(1 - cos(datmat[i] - datmat[j]))
> > }
> > }
> > hcA<-hclust(df,method="average")
> > ****************************************************
> > Ahmed
> > Florida
> >
> >
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list 
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html
> >
> 
> 
> 
> Ahmed Albatineh,PhD
> Assistant Professor of Statistics
> Nova Southeastern University
> Fort Lauderdale, FL 33314
> U.S.A
> 		