[R] cluster analysis with pairwise data

David L Carlson dcarlson at tamu.edu
Wed Apr 4 17:59:44 CEST 2012


You can create a distance matrix for each variable, square them, sum them,
and take the square root. As for getting the data into a data frame, the
simplest approach would be to enter the three variables into six columns,
like the following:

data
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    1    5    4    2
[2,]    7    8    3   88    6    5
[3,]    4    7   12    4    4    4
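
One way to get that layout into R, as a minimal sketch: type the values in
row by row and let matrix() shape them. The name `data` is reused from the
printout above, and byrow = TRUE is simply a convenient choice for typing
one observation per line; for a larger data set, reading a six-column file
with read.table() would be an alternative.

```r
# Each row holds the three (x, y) pairs of one observation, flattened
# into six columns: Variable 1 in 1:2, Variable 2 in 3:4, Variable 3 in 5:6.
data <- matrix(c(1, 2,  1,  5, 4, 2,
                 7, 8,  3, 88, 6, 5,
                 4, 7, 12,  4, 4, 4),
               nrow = 3, byrow = TRUE)
data
```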

Then use dist() on each pair of columns:

1:2, 3:4, 5:6 . . .

e.g. for the 3 rows of data you provided

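# Start from an all-zero dist object with one entry per pair of rows
dm <- dist(rep(0, nrow(data)))
# Add the squared distances for each pair of columns (1:2, 3:4, 5:6, ...)
for(i in seq(1, ncol(data), 2)) {
  dm <- dm + dist(data[,i:(i+1)])^2
}
dm <- sqrt(dm)
dm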
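
As a check on the result, and a sketch of the clustering step itself:
summing squared Euclidean distances over the column pairs is the same as
taking the Euclidean distance over all six columns at once, so dist(data)
should agree with the loop. The resulting dist object can then be passed to
hclust(). The average linkage, the cut into two groups, and the object
names `same`, `hc` and `groups` are assumptions made for illustration, not
something the data dictate.

```r
data <- matrix(c(1, 2,  1,  5, 4, 2,
                 7, 8,  3, 88, 6, 5,
                 4, 7, 12,  4, 4, 4),
               nrow = 3, byrow = TRUE)

# Pairwise approach: accumulate squared distances for each column pair
dm <- dist(rep(0, nrow(data)))
for (i in seq(1, ncol(data), 2)) {
  dm <- dm + dist(data[, i:(i + 1)])^2
}
dm <- sqrt(dm)

# The same distances computed over all six columns in one call
same <- isTRUE(all.equal(as.vector(dm), as.vector(dist(data))))
same

# Hierarchical clustering on the combined distances
# (average linkage and k = 2 are arbitrary choices for this sketch)
hc <- hclust(dm, method = "average")
groups <- cutree(hc, k = 2)
groups
```

plot(hc) would draw the dendrogram. With only three rows the tree is
trivial, but the same calls apply to the full data set.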

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of paladini
Sent: Wednesday, April 04, 2012 6:32 AM
To: r-help at r-project.org
Subject: [R] cluster analysis with pairwise data

Hello,
I want to do a cluster analysis with my data. The problem is that the
variables don't consist of single values; instead, the entries are pairs
of values.
It looks like this:


Variable 1:    Variable 2:     Variable 3:  .    .    .
(1,2)          (1,5)           (4,2)
(7,8)          (3,88)          (6,5)
(4,7)          (12,4)          (4,4)
.               .              .
.               .              .
.               .              .
Is it possible to perform a cluster analysis with this kind of data in
R?
I don't even know how to get this data into a matrix or a data frame or
anything like that.

It would be really nice if somebody could help me.

Best regards and happy Easter

Claudia

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


