[R] cluster analysis with pairwise data

ilai keren at math.montana.edu
Wed Apr 4 18:59:37 CEST 2012


On Wed, Apr 4, 2012 at 10:12 AM, Petr Savicky <savicky at cs.cas.cz> wrote:
> On Wed, Apr 04, 2012 at 01:32:10PM +0200, paladini wrote:

>  Var1 <- c("(1,2)", "(7,8)", "(4,7)")
>  Var2 <- c("(1,5)", "(3,88)", "(12,4)")
>  Var3 <- c("(4,2)", "(6,5)", "(4,4)")
>  DF <- data.frame(Var1, Var2, Var3, stringsAsFactors=FALSE)
>
> If you want to use a distance between pairs that depends on the
> numbers (and not only on whether two pairs are equal or different),
> then the data should be transformed to a numeric format.

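Or, if each pair has a unique meaning (i.e. it is a category label
rather than two measurements), ?daisy, also in the cluster package,
comes in handy. In this case you'll want to keep the Var columns as
factors in the data.frame() call that creates DF, i.e. use
stringsAsFactors=TRUE.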
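
A minimal sketch of that idea, assuming the cluster package is installed. The fourth row and the repeated pairs are made up for illustration (in the original three rows no pair repeats, so every dissimilarity would be 1), and the Gower metric, average linkage and k = 2 are just one possible set of choices:

```r
library(cluster)

# Hypothetical data: the pairs are treated as category labels.
# Rows 1 and 3 share the same pair in Var1 and Var2.
Var1 <- c("(1,2)", "(7,8)", "(1,2)", "(4,7)")
Var2 <- c("(1,5)", "(3,88)", "(1,5)", "(12,4)")
Var3 <- c("(4,2)", "(6,5)", "(6,5)", "(4,4)")

# Keep the columns as factors so daisy() treats them as nominal.
DF <- data.frame(Var1, Var2, Var3, stringsAsFactors = TRUE)

# For nominal variables the Gower dissimilarity is the share of
# variables on which two rows differ.
d <- daisy(DF, metric = "gower")
print(d)

# Agglomerative clustering on the dissimilarities, cut into 2 groups.
ag <- agnes(d, method = "average")
groups <- cutree(as.hclust(ag), k = 2)
print(groups)
```

Rows 1 and 3 differ only in Var3, so their dissimilarity is 1/3, while rows with no pair in common get 1.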

Cheers

> For example, as follows:
>
>  trans <- function(x)
>  {
>      y <- strsplit(gsub("[()]", "", x), ",")
>      unname(t(vapply(y, FUN=as.numeric, FUN.VALUE=c(0, 0))))
>  }
>
>  DF <- data.frame(Var1=trans(Var1), Var2=trans(Var2), Var3=trans(Var3))
>  DF
>
>    Var1.1 Var1.2 Var2.1 Var2.2 Var3.1 Var3.2
>  1      1      2      1      5      4      2
>  2      7      8      3     88      6      5
>  3      4      7     12      4      4      4
>
> Then, see library(help=cluster).
>
> Hope this helps.
>
> Petr Savicky.
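
As a rough sketch of the step after the numeric transformation, assuming the cluster package is installed: the matrix below simply retypes the numbers from the printed DF above, and Euclidean distance, average linkage and k = 2 are my own choices for illustration, not something Petr prescribes.

```r
library(cluster)

# Numeric values copied from the transformed data (one row per observation).
X <- rbind(c(1, 2,  1,  5, 4, 2),
           c(7, 8,  3, 88, 6, 5),
           c(4, 7, 12,  4, 4, 4))

# Euclidean distances between rows. The value 88 dominates here, so
# consider scale(X) first if the columns should be weighted equally.
d <- dist(X)
print(d)

# Agglomerative hierarchical clustering, then cut into 2 groups.
ag <- agnes(d, method = "average")
groups <- cutree(as.hclust(ag), k = 2)
print(groups)
```

With these numbers rows 1 and 3 end up in the same group and row 2 on its own, mostly because of the 88 in the fourth column.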


