[R] Help: Mahalanobis distances between 'Species' from iris
Jose Claudio Faria
joseclaudio.faria at terra.com.br
Mon Jul 11 02:47:44 CEST 2005
Dear Mulholland,
I'm sorry, I was not clearly and objective sufficiently.
Below you can see what I'm was trying to do:
# D2Mah by JCFaria and Gabor Grothiendieck: 10/7/05 20:46:41
D2Mah = function(y, x) {
stopifnot(is.data.frame(y), !missing(x))
stopifnot(dim(y)[1] != dim(x)[1])
y = as.matrix(y)
x = as.factor(x)
man = manova(y ~ x)
E = summary(man)$SS[2] #Matrix E
S = as.matrix(E$Residuals)/man$df.residual
InvS = solve(S)
mds = matrix(unlist(by(y, x, mean)), byrow=T, ncol=ncol(y))
f = function(a,b) mapply(function(a,b)
mahalanobis(mds[a,], mds[b,], InvS, TRUE), a, b)
seq. = seq(length = nrow(mds))
names(seq.) = levels(x)
D2 = outer(seq., seq., f)
}
#
# test
#
D2M = D2Mah(iris[,1:4], iris[,5])
print(D2M)
dend = hclust(as.dist(sqrt(D2M)), method='complete')
plot(dend)
So, Thanks for the reply.
Best,
JCFaria
Mulholland, Tom wrote:
> Why don't you use mahalanobis in the stats package. The function "Returns the Mahalanobis distance of all rows in 'x' and the vector mu='center' with respect to Sigma='cov'. This is (for vector 'x') defined as
>
> D^2 = (x - mu)' Sigma^{-1} (x - mu)
>
> I don't have any idea what this does but it appears to be talking about the same topic. If it is not suitable package fpc has related functions and adehabitat has a function for "Habitat Suitability Mapping with Mahalanobis Distances"
>
> Tom
>
>
>>-----Original Message-----
>>From: r-help-bounces at stat.math.ethz.ch
>>[mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Jose
>>Claudio Faria
>>Sent: Thursday, 7 July 2005 5:29 AM
>>To: r-help at stat.math.ethz.ch
>>Subject: [R] Help: Mahalanobis distances between 'Species' from iris
>>
>>
>>Dear R list,
>>
>>I'm trying to calculate Mahalanobis distances for 'Species'
>>of 'iris' data
>>as obtained below:
>>
>>Squared Distance to Species From Species:
>>
>> Setosa Versicolor Virginica
>>Setosa 0 89.86419 179.38471
>>Versicolor 89.86419 0 17.20107
>>Virginica 179.38471 17.20107 0
>>
>>This distances above were obtained with proc 'CANDISC' of SAS, please,
>>see Output 21.1.2: Iris Data: Squared Mahalanobis Distances from
>>http://www.id.unizh.ch/software/unix/statmath/sas/sasdoc/stat/
>>chap21/sect19.htm
>>
>> From this distance my intention is to make a cluster
>>analysis as below, using
>>the package 'mclust':
>>
>>#
>># --- Begin R script ---
>>#
>># For units compatibility of 'iris' from R dataset and 'iris'
>>data used in
>># the SAS example:
>>Measures = iris[,1:4]*10
>>Species = iris[,5]
>>irisSAS = data.frame(Measures, Species)
>>
>>n = 3
>>Mah = c( 0,
>> 89.86419, 0,
>> 179.38471, 17.20107, 0)
>>
>># My Question is: how to obtain 'Mah' with R from 'irisSAS' data?
>>
>>D = matrix(0, n, n)
>>
>>nam = c('Set', 'Ver', 'Vir')
>>rownames(D) = nam
>>colnames(D) = nam
>>
>>k = 0
>>for (i in 1:n) {
>> for (j in 1:i) {
>> k = k+1
>> D[i,j] = Mah[k]
>> D[j,i] = Mah[k]
>> }
>>}
>>
>>D=sqrt(D) #D2 -> D
>>
>>library(mclust)
>>dendroS = hclust(as.dist(D), method='single')
>>dendroC = hclust(as.dist(D), method='complete')
>>
>>win.graph(w = 3.5, h = 6)
>>split.screen(c(2, 1))
>>screen(1)
>>plot(dendroS, main='Single', sub='', xlab='', ylab='', col='blue')
>>
>>screen(2)
>>plot(dendroC, main='Complete', sub='', xlab='', col='red')
>>#
>># --- End R script ---
>>#
>>
>>I always need of this type of analysis and I'm not founding
>>how to make it in
>>the CRAN documentation (Archives, packages: mclust, cluster,
>>fpc and mva).
>>
>>Regards,
>>--
>>Jose Claudio Faria
>>Brasil/Bahia/UESC/DCET
>>Estatistica Experimental/Prof. Adjunto
>>mails:
>> joseclaudio.faria at terra.com.br
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide!
>>http://www.R-project.org/posting-guide.html
>>
More information about the R-help
mailing list