[R] cluster analysis with pairwise data
David L Carlson
dcarlson at tamu.edu
Wed Apr 4 17:59:44 CEST 2012
You can create a distance matrix for each variable, square them, sum them,
and take the square root of the sum. As for getting the data into a data
frame, the simplest approach would be to enter the three variables into six
columns, like the following:
data
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    2    1    5    4    2
[2,]    7    8    3   88    6    5
[3,]    4    7   12    4    4    4
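
A minimal sketch of one way to enter those three rows as a matrix, in case
it helps. The column names are hypothetical labels added only to keep the
pairs recognisable; they are not part of the original data.

```r
# Each row is one observation; columns 1:2, 3:4 and 5:6 hold the
# pairs for Variable 1, Variable 2 and Variable 3 respectively.
data <- matrix(c(1, 2,  1,  5, 4, 2,
                 7, 8,  3, 88, 6, 5,
                 4, 7, 12,  4, 4, 4),
               ncol = 6, byrow = TRUE)

# Hypothetical column names (an assumption, purely for readability)
colnames(data) <- c("v1a", "v1b", "v2a", "v2b", "v3a", "v3b")

# The pair of columns belonging to Variable 2
data[, 3:4]
```

as.data.frame(data) would turn this into a data frame, but dist() accepts
the matrix directly.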
Then use dist() on each pair of columns (1:2, 3:4, 5:6, . . .),
e.g. for the 3 rows of data you provided:
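n <- nrow(data)
dm <- dist(rep(0, n))   # all-zero dist object with n*(n-1)/2 entries
for (i in seq(1, ncol(data), 2)) {
  # add the squared Euclidean distances for the pair in columns i, i+1
  dm <- dm + dist(data[, i:(i+1)])^2
}
dm <- sqrt(dm)
dm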
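
As a rough check, and only as a sketch: with dist()'s default Euclidean
metric, summing the squared per-variable distances and taking the square
root gives the same numbers as calling dist() once on all six columns. The
combined distances can then be passed to hclust(). The choice of
method = "average" below is an assumption for illustration, not a
recommendation.

```r
data <- matrix(c(1, 2,  1,  5, 4, 2,
                 7, 8,  3, 88, 6, 5,
                 4, 7, 12,  4, 4, 4),
               ncol = 6, byrow = TRUE)

# Combine the per-variable Euclidean distances
dm <- dist(rep(0, nrow(data)))
for (i in seq(1, ncol(data), 2)) {
  dm <- dm + dist(data[, i:(i + 1)])^2
}
dm <- sqrt(dm)

# Same values as Euclidean distance over all six columns at once
same <- all.equal(as.vector(dm), as.vector(dist(data)))
same

# Hierarchical clustering on the combined distances
hc <- hclust(dm, method = "average")
plot(hc)
```

The pair-by-pair loop becomes useful mainly if you want a different metric
or a different weight for each variable; for plain Euclidean distance,
dist(data) alone is enough.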
----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of paladini
Sent: Wednesday, April 04, 2012 6:32 AM
To: r-help at r-project.org
Subject: [R] cluster analysis with pairwise data
Hello,
I want to do a cluster analysis with my data. The problem is that the
variables don't consist of single values; instead, the entries are pairs
of values.
It looks like this:
Variable 1:   Variable 2:   Variable 3:   . . .
(1,2)         (1,5)         (4,2)
(7,8)         (3,88)        (6,5)
(4,7)         (12,4)        (4,4)
.             .             .
.             .             .
.             .             .
Is it possible to perform a cluster analysis with this kind of data in
R?
I don't even know how to get this data into a matrix or a data frame or
anything like this.
It would be really nice if somebody could help me.
Best regards and happy Easter
Claudia