[R] Minimum Spanning Tree

Gábor Csárdi csardi at rmki.kfki.hu
Tue Apr 7 20:35:16 CEST 2009


Josh, I would recommend to use a package that supports networks, e.g.
igraph, but there are others as well.

You can read in the data using 'read.csv()', transform it to a matrix
with 'as.matrix()', and then create an igraph object from it with
'graph.adjacency()'.

Then call 'minimum.spanning.tree()' to calculate the tree and 'plot()'
to plot it. E.g. for your example file:

library(igraph)
D <- read.csv("/tmp/matrix.csv")
D <- D[,-1]       # we don't need the first column
G <- graph.adjacency(as.matrix(D), weighted=TRUE)

## Some graphical parameters
V(G)$label <- V(G)$name
V(G)$shape <- "rectangle"
V(G)$color <- "white"
V(G)$size <- 40

## MST and plot
mst <- minimum.spanning.tree(G)
lay <- layout.reingold.tilford(G, mode="all")
plot(mst, layout=lay)

Best,
Gabor

On Tue, Apr 7, 2009 at 8:00 PM, jpearl01 <joshearl1 at hotmail.com> wrote:
>
> Hi all, I'm very new to R and read a few tutorials, however I'm having
> difficulty trying to figure out how to plot a minimum spanning tree.  I have
> a csv file that contains an n-by-n matrix of distances between strains of
> bacteria called matrix.csv.
>
> Looks like:
> id,strain1, strain2,strain3
> strain1,0,.2,.8
> strain2,.3,0,.7
> strain3,.4,.6,0
>
> I've been messing around with some information I've found on the web that
> prints out an mst, however I think it does it with random values, instead of
> values provided in a dataset (like from my csv file).  Here's what I tried
> looks like:
> x <- runif(100)
> y <- runif(100)
> nearest_neighbour <- function (x, y, d=dist(cbind(x,y)), ...) {
>  n <- length(x)
>  stopifnot(length(x) == length(y))
>  d <- as.matrix(d)
>  stopifnot( dim(d)[1] == dim(d)[2] )
>  stopifnot( length(x) == dim(d)[1] )
>  i <- 1:n
>  j <- apply(d, 2, function (a) order(a)[2])
>  segments(x[i], y[i], x[j], y[j], ...)
> }
> plot(x, y,
>     main="Nearest neighbour graph",
>     xlab = "", ylab = "")
> nearest_neighbour(x, y)
>
> This gets the Nearest Neighbors, and then:
>
> plot(x, y,
>     main = "Minimum spanning tree",
>     xlab = "", ylab = "")
> nearest_neighbour(x, y, lwd=10, col="grey")
> points(x,y)
> library(ape)
> r <- mst(dist(cbind(x, y)))
> i <- which(r==1, arr.ind=TRUE )
> segments(x[i[,1]], y[i[,1]], x[i[,2]], y[i[,2]],
>         lwd = 2, col = "blue")
>
> This prints the mst.  This would be perfect for me!  However I don't know
> how to make this use my dataset... Any help (including links to helpful
> tutorials!) would be awesome,
>
> Thanks!
> ~josh
>
>
>
> --
> View this message in context: http://www.nabble.com/Minimum-Spanning-Tree-tp22934813p22934813.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Gabor Csardi <Gabor.Csardi at unil.ch>     UNIL DGM




More information about the R-help mailing list