[R] gplot heatmaps: clustering according to rowsidecolors + key.xtickfun
Gregory Warnes
greg at warnes.net
Thu Sep 18 00:44:15 CEST 2014
Oops, I forgot to mention that an bug was preventing RowSideColors from
working properly. It is fixed in version 2.14.2 of gplots which I've just
uploaded to CRAN and am attaching to this email.
-Greg
On Wed, Sep 17, 2014 at 5:12 PM, Gregory R. Warnes <greg at warnes.net> wrote:
> Hello Tim,
>
> Sorry about the slow response, I just found this message.
>
> On Sep 4, 2014, at 9:23 AM, Tim Richter-Heitmann <trichter at uni-bremen.de>
> wrote:
>
> Hi there,
>
> I have two questions about heatmap.2 in gplot.
> My input is a simple square matrix with numeric values between 75 and 100
> (it is a similarity matrix based on bacterial DNA sequences).
>
> 1. I can sort my input matrix into categories with rowsidecolors (in this
> case, very conveniently by bacterial taxa). I do a clustering and
> reordering of my matrix by Rowv=TRUE (and Colv="Rowv").
> The question is now, can i combine the two features that the
> clustering/reordering is done only for submatrices defined by the vectors
> given in rowsidecolors (so, in this case, that the original ordering by
> bacterial taxa is preserved)?
>
> That would be very amazing.
>
>
> Hmm. To get the individual species clustered within taxa would require
> doing the hierarchical clustering first separately, then combining the
> dendrograms. This should do the trick:
>
>
> set.seed(1234567)
>
> ## Dummy Distances
> x <- matrix( rnorm(400, mean=87.5, sd=12.5/4), ncol=20)
>
> ## Dummy Taxa
> taxa <- sample(letters[1:4], 20, replace=T)
> taxa <- as.factor(taxa)
>
> # sort the data by taxa
> ord <- order(taxa)
>
> x <- x[ord, ord]
> taxa <- taxa[ord]
> rownames(x) <- 1:nrow(x)
>
>
> ####
> # stats:::merge.dendrogram is broken. This is the corrected version.
> # See R BUG 15648
> # (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15648) for
> # details
> ####
> merge.dendrogram <- function(x, y, ..., height) {
> stopifnot(inherits(x,"dendrogram"), inherits(y,"dendrogram"))
>
> ### FIX
> inx.add <- function(inx, add) {
> if(is.leaf(inx)) {
> inx <- inx + add
> }
> return(inx)
> }
> y <- dendrapply(y, inx.add, add=max(unlist(x)))
> ### FIX
>
> r <- list(x,y)
> if(length(xtr <- list(...))) {
> if(!all(is.d <- vapply(xtr, inherits, NA, what="dendrogram"))) {
> xpr <- substitute(c(...))
> nms <- sapply(xpr[-1][!is.d], deparse, nlines = 1L)
> ## do not simplify: xgettext needs this form
> msg <- ngettext(length(nms),
> "extra argument %s is not of class \"%s\"",
> "extra arguments %s are not of class \"%s\"s")
> stop(sprintf(msg, paste(nms, collapse=", "), "dendrogram"),
> domain = NA)
> }
> ## <GRW>
> for(i in 1:length(xtr))
> {
> add <- max(c(unlist(r), unlist(xtr)))
> print(add)
> xtr[[i]] <- dendrapply(xtr[[i]], inx.add, add=add)
> }
> ## </GRW>
> r <- c(r, xtr)
> }
> attr(r, "members") <- sum(vapply(r, attr, 0L, which="members"))
> h.max <- max(vapply(r, attr, 0., which="height"))
> if(missing(height) || is.null(height))
> height <- 1.1 * h.max
> else if(height < h.max) {
> msg <- gettextf("'height' must be at least %g, the maximal height of
> its components", h.max)
> stop(msg, domain = NA)
> }
> attr(r, "height") <- height
> class(r) <- "dendrogram"
> midcache.dendrogram(r, quiet=TRUE)
> }
>
>
> ## Compute dendrograms within each taxum, then merge into a combined
> dendrogram
> dendList <- list()
> for( taxon in levels(taxa) )
> {
> items <- which(taxon==taxa)
> submatrix <- x[ items, items]
> dend <- as.dendrogram(hclust(dist(submatrix)))
> dendList[[taxon]] <- dend
> }
> names(dendList) <- NULL
> dends <- do.call("merge.dendrogram", dendList)
>
> ## Now generate the heatmap
> heatmap.2(x,
> Rowv=dends,
> Colv=dends,
> symm=TRUE,
> RowSideColors=c("red","blue","green","black")[as.numeric(taxa)],
> ColSideColors=c("red","blue","green","black")[as.numeric(taxa)],
> trace="none"
> )
>
> 2. I have set my own coloring rules with:
>
> mypal <- c("grey","blue", "green","yellow","orange","red")
> col_breaks = c(seq(0,74.9), seq(75.0,78.4), seq(78.5,81.9),
> seq(82.0,86.4), seq(86.5, 94.5), seq(94.5,100.0))
>
> Is it possible to pass this sequential ordering to key.xtickfun? May i ask
> for an example code?
>
>
> Use the ‘breaks’ and ‘col’ arguements:
>
>
> ## Custom color key
> mypal <- c("grey","blue", "green","yellow","orange","red")
> col_breaks <- c(0,75.0,78.5,82.0,86.5,94.5,100.0)
>
>
> heatmap.2(x,
> Rowv=dends,
> Colv=dends,
> symm=TRUE,
> RowSideColors=c("red","blue","green","black")[as.numeric(taxa)],
> ColSideColors=c("red","blue","green","black")[as.numeric(taxa)],
> trace="none",
> breaks=col_breaks,
> col=mypal
> )
>
> -Greg
>
>
--
"Whereas true religion and good morals are the only solid foundations of
public liberty and happiness . . . it is hereby earnestly recommended to
the several States to take the most effectual measures for the
encouragement thereof." Continental Congress, 1778
More information about the R-help
mailing list