[R] cluster analysis with pairwise data
ilai
keren at math.montana.edu
Wed Apr 4 18:59:37 CEST 2012
On Wed, Apr 4, 2012 at 10:12 AM, Petr Savicky <savicky at cs.cas.cz> wrote:
> On Wed, Apr 04, 2012 at 01:32:10PM +0200, paladini wrote:
> Var1 <- c("(1,2)", "(7,8)", "(4,7)")
> Var2 <- c("(1,5)", "(3,88)", "(12,4)")
> Var3 <- c("(4,2)", "(6,5)", "(4,4)")
> DF <- data.frame(Var1, Var2, Var3, stringsAsFactors=FALSE)
>
> If you want to use a distance between pairs that depends on the
> numbers (and not only on whether two pairs are equal or different),
> then the data should be transformed to a numeric format.
Or, if each pair is a label with its own meaning rather than two
numbers, ?daisy (also in the cluster package) comes in handy. In that
case you'll want to keep the Vi as factors in the call to data.frame.
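
A minimal sketch of what I mean, assuming the pairs are purely nominal labels. The four-row data set below is hypothetical (made up so that some rows share pairs), and the Gower metric, average linkage and k = 2 are illustrative choices only:

```r
library(cluster)  # provides daisy(); a recommended package in standard R installs

# Hypothetical data: each "(a,b)" string is treated as a category label
Var1 <- c("(1,2)", "(7,8)", "(1,2)", "(4,7)")
Var2 <- c("(1,5)", "(3,88)", "(1,5)", "(3,88)")
Var3 <- c("(4,2)", "(6,5)", "(4,4)", "(6,5)")

# Keep the variables as factors so daisy() treats them as nominal
DF <- data.frame(Var1, Var2, Var3, stringsAsFactors = TRUE)

# Gower dissimilarity: for nominal variables, the fraction of
# variables on which two rows carry different labels
d <- daisy(DF, metric = "gower")

# Hierarchical clustering on the dissimilarities, cut into two groups
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
```

Rows 1 and 3 differ only in Var3, so their dissimilarity is 1/3, and they end up in the same group. Note that with the original three rows every pair is distinct, so all dissimilarities would be 1 and there would be nothing to cluster on.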
Cheers
> For example, as follows:
>
> trans <- function(x)
> {
>     y <- strsplit(gsub("[()]", "", x), ",")
>     unname(t(vapply(y, FUN=as.numeric, FUN.VALUE=c(0, 0))))
> }
>
> DF <- data.frame(Var1=trans(Var1), Var2=trans(Var2), Var3=trans(Var3))
> DF
>
>   Var1.1 Var1.2 Var2.1 Var2.2 Var3.1 Var3.2
> 1      1      2      1      5      4      2
> 2      7      8      3     88      6      5
> 3      4      7     12      4      4      4
>
> Then, see library(help=cluster).
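
To make that last step concrete, here is a rough sketch of one way to go from the numeric data frame to clusters. It uses base R's dist() and hclust() rather than anything specific from the cluster package; Euclidean distance, average linkage and k = 2 are my assumptions, not part of the suggestion above:

```r
Var1 <- c("(1,2)", "(7,8)", "(4,7)")
Var2 <- c("(1,5)", "(3,88)", "(12,4)")
Var3 <- c("(4,2)", "(6,5)", "(4,4)")

# Strip the parentheses, split on the comma, return an n x 2 numeric matrix
trans <- function(x) {
    y <- strsplit(gsub("[()]", "", x), ",")
    unname(t(vapply(y, FUN = as.numeric, FUN.VALUE = c(0, 0))))
}

DF <- data.frame(Var1 = trans(Var1), Var2 = trans(Var2), Var3 = trans(Var3))

# Euclidean distance between rows over all six numeric columns
d <- dist(DF)

# Average-linkage hierarchical clustering, cut into two groups
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
```

The unscaled distance here is dominated by the 88 in row 2; whether to standardize the columns first (e.g. with scale()) depends on what the numbers mean.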
>
> Hope this helps.
>
> Petr Savicky.