[BioC] Cluster analysis distance measure

Jenny Bryan jenny at stat.ubc.ca
Fri Nov 12 19:03:35 CET 2004


> From: "Auer Michael" <michael.auer at meduniwien.ac.at>
> 
> I would like to know whether there exists the possibility to cluster genes
> non-hierarchically, but with the correlation as distance measure? K-means,
> clara, pam, etc., only seem to work with Euclidean metrics. I ask the

Many clustering algorithms, pam for example, will accept a
dissimilarity object as input.  The limitation you perceive arises
only if you ask the pam function itself to compute the dissimilarity
for you.  Below is a tiny example that uses a '1 minus absolute
correlation' dissimilarity.

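############################
library(cluster)  # pam()
library(MASS)     # mvrnorm()

# Group x: 4 draws of 3 variables with pairwise correlation 0.7
Sigma.x <- matrix(0.7, nrow = 3, ncol = 3)
diag(Sigma.x) <- 1
x <- mvrnorm(n = 4, mu = c(3, 5, 3), Sigma = Sigma.x)

# Group y: 4 draws of 3 variables with pairwise correlation 0.6
Sigma.y <- matrix(0.6, nrow = 3, ncol = 3)
diag(Sigma.y) <- 1
y <- mvrnorm(n = 4, mu = rep(1, 3), Sigma = Sigma.y)

# Stack the groups; the rows are the objects to be clustered
z <- rbind(x, y)
matplot(1:3, t(z), col = rep(c("red", "green"), each = 4), type = "l", lty = 1)

# cor() correlates columns, so transpose to correlate the rows.
# abs() treats strongly negatively correlated rows as similar;
# use 1 - cor(t(z)) instead if they should end up far apart.
cor.dist.z <- as.dist(1 - abs(cor(t(z))))
pamfit <- pam(cor.dist.z, k = 2)  # a "dist" object is used as the dissimilarity
plot(pamfit)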
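
The example above uses 1 - abs(cor), which treats strongly
anti-correlated rows as similar.  If anti-correlated genes should
instead land in different clusters, a signed '1 - cor' dissimilarity
is one option.  The sketch below is only an illustration under
assumed toy data: the matrix g, the seed and the noise level are
made up for this purpose and are not part of the example above.

```r
library(cluster)

set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Hypothetical toy data: 6 "genes" over 5 conditions.
# Rows 1-3 follow an increasing profile, rows 4-6 a decreasing one.
profile <- c(1, 2, 3, 4, 5)
g <- rbind(
  t(replicate(3,  profile + rnorm(5, sd = 0.1))),
  t(replicate(3, -profile + rnorm(5, sd = 0.1)))
)

# cor() correlates columns, so transpose to correlate genes (rows).
r <- cor(t(g))

# Signed version: anti-correlated genes get a dissimilarity near 2.
d.signed <- as.dist(1 - r)
# Absolute version: anti-correlated genes get a dissimilarity near 0.
d.abs <- as.dist(1 - abs(r))

# pam() treats an object of class "dist" as a dissimilarity.
fit <- pam(d.signed, k = 2)
fit$clustering   # cluster label for each gene
fit$medoids      # which genes serve as the cluster medoids
```

With the signed version, the increasing and decreasing profiles are
about as far apart as this dissimilarity allows, so pam separates
them; with d.abs, all six rows would look nearly identical.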

-- 
Jenny Bryan
*----------------------------------*
* Assistant Professor              *
* Department of Statistics and     *
*   the Michael Smith Laboratories *
* University of British Columbia   *
*----------------------------------*
333-6356 Agricultural Road        
Vancouver, BC V6T 1Z2 Canada      
tel:   604.822.6422   
fax:   604.822.6960   
email: jenny at stat.ubc.ca


