# [R] Help: Mahalanobis distances between 'Species' from iris

Jose Claudio Faria joseclaudio.faria at terra.com.br
Mon Jul 11 02:47:44 CEST 2005

```Dear Mulholland,

I'm sorry, I was not clearly and objective sufficiently.
Below you can see what I'm was trying to do:

# D2Mah by JCFaria and Gabor Grothiendieck: 10/7/05 20:46:41
D2Mah = function(y, x) {

stopifnot(is.data.frame(y), !missing(x))
stopifnot(dim(y) != dim(x))
y    = as.matrix(y)
x    = as.factor(x)
man  = manova(y ~ x)
E    = summary(man)\$SS #Matrix E
S    = as.matrix(E\$Residuals)/man\$df.residual
InvS = solve(S)
mds  = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))

f = function(a,b) mapply(function(a,b)
mahalanobis(mds[a,], mds[b,], InvS, TRUE), a, b)
seq. = seq(length = nrow(mds))
names(seq.) = levels(x)
D2 = outer(seq., seq., f)
}

#
# test
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)
dend = hclust(as.dist(sqrt(D2M)), method='complete')
plot(dend)

Best,

JCFaria

Mulholland, Tom wrote:
> Why don't you use mahalanobis in the stats package. The function "Returns the Mahalanobis distance of all rows in 'x' and the vector mu='center' with respect to Sigma='cov'. This is (for vector 'x') defined as
>
>                  D^2 = (x - mu)' Sigma^{-1} (x - mu)
>
> I don't have any idea what this does but it appears to be talking about the same topic. If it is not suitable package fpc has related functions and adehabitat has a function for "Habitat Suitability Mapping with Mahalanobis Distances"
>
> Tom
>
>
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch
>>[mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Jose
>>Claudio Faria
>>Sent: Thursday, 7 July 2005 5:29 AM
>>To: r-help at stat.math.ethz.ch
>>Subject: [R] Help: Mahalanobis distances between 'Species' from iris
>>
>>
>>Dear R list,
>>
>>I'm trying to calculate Mahalanobis distances for 'Species'
>>of 'iris' data
>>as obtained below:
>>
>>Squared Distance to Species From Species:
>>
>>               Setosa Versicolor Virginica
>>Setosa 	           0   89.86419 179.38471
>>Versicolor  89.86419          0  17.20107
>>Virginica  179.38471   17.20107         0
>>
>>This distances above were obtained with proc 'CANDISC' of SAS, please,
>>see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
>>http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/
>>chap21/sect19.htm
>>
>> From this distance my intention is to make a cluster
>>analysis as below, using
>>the package 'mclust':
>>
>>#
>># --- Begin R script ---
>>#
>># For units compatibility of 'iris' from R dataset and 'iris'
>>data used in
>># the SAS example:
>>Measures = iris[,1:4]*10
>>Species  = iris[,5]
>>irisSAS  = data.frame(Measures, Species)
>>
>>n   = 3
>>Mah = c(        0,
>>          89.86419,        0,
>>         179.38471, 17.20107, 0)
>>
>># My Question is: how to obtain 'Mah' with R from 'irisSAS' data?
>>
>>D = matrix(0, n, n)
>>
>>nam = c('Set', 'Ver', 'Vir')
>>rownames(D) = nam
>>colnames(D) = nam
>>
>>k = 0
>>for (i in 1:n) {
>>    for (j in 1:i) {
>>       k      = k+1
>>       D[i,j] = Mah[k]
>>       D[j,i] = Mah[k]
>>    }
>>}
>>
>>D=sqrt(D) #D2 -> D
>>
>>library(mclust)
>>dendroS = hclust(as.dist(D), method='single')
>>dendroC = hclust(as.dist(D), method='complete')
>>
>>win.graph(w = 3.5, h = 6)
>>split.screen(c(2, 1))
>>screen(1)
>>plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')
>>
>>screen(2)
>>plot(dendroC, main='Complete', sub='', xlab='', col='red')
>>#
>># --- End R script ---
>>#
>>
>>I always need of this type of analysis and I'm not founding
>>how to make it in
>>the CRAN documentation (Archives, packages: mclust, cluster,
>>fpc and mva).
>>
>>Regards,
>>--
>>Jose Claudio Faria
>>Brasil/Bahia/UESC/DCET
>>mails:
>>  joseclaudio.faria at terra.com.br
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help