[R] Combined Marimekko/heatmap

Neal Humphrey nhumphrey at clasponline.org
Fri Dec 14 19:03:36 CET 2012


Thomas, 

This is a big help for getting me started. I'm brand new to R, so I'm unfamiliar with how to draw graphs 'manually' (instead of using packages). 

The graph your code makes is more like a Marimekko chart. What I'm thinking of is like a heatmap in which each row has its own height and each column has its own width, but any particular column keeps the same width all the way down and any particular row keeps the same height all the way across. 

I used your code to figure out how to draw lines on a chart the way I need this to look. Now what I need to figure out is how to add color coding to the respective squares. In my example it's binary data (yes/no), but a more robust approach would allow for a true heatmap. 
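For the "true heatmap" case, a minimal sketch of turning continuous values into fill colors might look like the following. The values in `vals` are made up purely for illustration (the thread only has yes/no data), and the palette endpoints and the number of color classes `n_col` are arbitrary assumptions.

```r
# Hypothetical continuous values, one per country/product cell
# (7 countries x 5 products); not data from this thread.
vals <- matrix(seq(0, 1, length.out = 35), nrow = 7)

# Build a palette and bin each value into one of n_col equal-width classes.
n_col <- 9
pal   <- colorRampPalette(c("white", "firebrick"))(n_col)
bins  <- cut(as.vector(vals),
             breaks = seq(min(vals), max(vals), length.out = n_col + 1),
             include.lowest = TRUE, labels = FALSE)

# One color per cell, in the same layout as vals.
cell_col <- matrix(pal[bins], nrow = nrow(vals))
```

A matrix like `cell_col` could then be passed as the `col` argument when the cells are drawn as filled rectangles.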

Any help on getting this to the next step of color coding would be much appreciated! The code below will draw a grid that is scaled per the sample data on the x and y axes. Perhaps what is needed is to draw boxes rather than lines to make the grid? I'm not sure. 

Neal

par(mar = c(1,1,1,1),
    oma = c(0,0,0,0),
    mgp = c(1.5,.2,0),
    tcl = 0,
    cex.axis = .75,
    col.axis = "black",
    pch = 16)

#--------Input the sample data-------------------
Z <- textConnection("
country TWh
Australia 244
Canada 522
China 3503
EU 2500
India 689.5
Japan 997
USA 3960
")
CountryEnergy <- read.table(Z,header=TRUE)
close(Z)

Z <- textConnection("
Product TWh
Freezer 72.4
Refrigerator 379
HVAC 466
Lighting 123
Television 152
")
ProductEnergy <- read.table(Z,header=TRUE)
close(Z)

#-------------Binary data indicating whether that country/product combination has a standard----------------
#-------------Rows correspond to countries (in the same order as the CountryEnergy table)-------------------
#-------------Columns correspond to products (in the same order as the ProductEnergy table)-----------------
#-------------No header line here: the table is read with header=FALSE, so every line must have 5 values----
Z <- textConnection("
0 1 0 0 1
0 1 1 1 0
1 0 1 0 0
1 1 1 0 0
0 1 1 1 1
1 0 0 0 1
1 0 0 0 0
")
ddd <- read.table(Z,header=FALSE)
close(Z)

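#-----------rewrite each data table as a numeric vector of TWh values, named by country/product----------
# (the column is 'country' in lower case, so $Country would return NULL, and [,2:2]
#  drops the row names; the names are therefore attached to the vector directly)
CountryEnergy <- setNames(CountryEnergy$TWh, as.character(CountryEnergy$country))
ProductEnergy <- setNames(ProductEnergy$TWh, as.character(ProductEnergy$Product))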


#-----------plot the grid------------
plot.new()
plot.window(ylim=c(0,sum(CountryEnergy)),xlim=c(0,sum(ProductEnergy)),xaxs = 'i',yaxs='i',las=1)
box()
abline(h = cumsum(CountryEnergy),lwd=2, col="gray60")     #lwd - line width
abline(v = cumsum(ProductEnergy), lwd=2, col="gray60")

labxats <- NULL

#----------Use ddd data to code the cells as yes/no for having a standard---------------------
#
#
#  this is the part I need help with
#
#
#
#----------------------------------------------------------------------------------------------
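One possible way to fill in the step marked above is to draw each cell as a filled box with `rect()`, which is vectorized over its corner arguments, instead of drawing grid lines. The block below is only a sketch: it re-enters the sample data so that it runs on its own, and the names `country_twh`, `product_twh`, `has_std`, `x_edges`, `y_edges`, and the two fill colors are choices made for this illustration rather than anything from the code above.

```r
# Sample data from this message, re-entered as named vectors and a matrix.
country_twh <- c(Australia = 244, Canada = 522, China = 3503, EU = 2500,
                 India = 689.5, Japan = 997, USA = 3960)
product_twh <- c(Freezer = 72.4, Refrigerator = 379, HVAC = 466,
                 Lighting = 123, Television = 152)
has_std <- matrix(c(0, 1, 0, 0, 1,
                    0, 1, 1, 1, 0,
                    1, 0, 1, 0, 0,
                    1, 1, 1, 0, 0,
                    0, 1, 1, 1, 1,
                    1, 0, 0, 0, 1,
                    1, 0, 0, 0, 0),
                  nrow = 7, byrow = TRUE,
                  dimnames = list(names(country_twh), names(product_twh)))

# Cell boundaries: column j spans x_edges[j] to x_edges[j + 1],
# row i spans y_edges[i] to y_edges[i + 1] (first country at the bottom).
x_edges <- c(0, cumsum(product_twh))
y_edges <- c(0, cumsum(country_twh))

# One entry per cell: row index i, column index j, and a fill color.
cells    <- expand.grid(i = seq_along(country_twh), j = seq_along(product_twh))
cell_col <- ifelse(has_std[cbind(cells$i, cells$j)] == 1, "steelblue", "white")

par(mar = c(1, 5, 5, 1))
plot.new()
plot.window(xlim = range(x_edges), ylim = range(y_edges), xaxs = "i", yaxs = "i")
rect(xleft   = x_edges[cells$j],     ybottom = y_edges[cells$i],
     xright  = x_edges[cells$j + 1], ytop    = y_edges[cells$i + 1],
     col = cell_col, border = "gray60", lwd = 2)
box()

# Label each row and column at the midpoint of its span.
axis(2, at = (head(y_edges, -1) + tail(y_edges, -1)) / 2,
     labels = names(country_twh), las = 1, tick = FALSE)
axis(3, at = (head(x_edges, -1) + tail(x_edges, -1)) / 2,
     labels = names(product_twh), las = 2, tick = FALSE)
```

Because the fill color is just a character vector with one entry per cell, the yes/no coding could be replaced by any other mapping from values to colors without changing the drawing calls.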


Neal Humphrey
nhumphrey at clasponline.org

From: Thomas Stewart [mailto:tgs at live.unc.edu] 
Sent: Friday, December 14, 2012 10:36 AM
To: Neal Humphrey
Cc: r-help at r-project.org
Subject: Re: [R] Combined Marimekko/heatmap

Neal-

Perhaps the following code is a start for what you want.

-tgs

par(mar = c(1,1,1,1),
    oma = c(0,0,0,0),
    mgp = c(1.5,.2,0),
    tcl = 0,
    cex.axis = .75,
    col.axis = "black",
    pch = 16)

Z <- textConnection("
country A1 A2 A3
A 3 4 5
B 6 9 8
C 6 9 5")
ddd <- read.table(Z,header=TRUE)
close(Z)


CountryPcts <- rowSums(ddd[,-1]) / sum(ddd[,-1])

plot.new()
plot.window(ylim=c(0,1),xlim=c(0,1),xaxs = 'i',yaxs='i',las=1)
box()
abline(h = cumsum(CountryPcts),lwd=2)

labxats <- NULL
vlines <- ddd[,-1] / sum(ddd[,-1]) / CountryPcts
vlines <- t(apply(vlines,1,cumsum))
yyy <- c(0,rep(cumsum(CountryPcts),each=2))
yyy <- head(yyy,-1)
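# loop over the columns of vlines (one stepped line per column A1, A2, A3);
# 1:nrow(ddd) only worked here because the example has 3 rows and 3 columns
for(i in seq_len(ncol(vlines)) ){
  xxx <- rep(vlines[,i],each=2)
  lines(xxx,yyy,col="red",lwd=3)
  labxats[i] <- rev(xxx)[1]
}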

labxats <- (labxats + c(0,head(labxats,-1)))/2
labyats <- (cumsum(CountryPcts) + c(0,head(cumsum(CountryPcts),-1)))/2
axis(2,at=labyats,labels = ddd[,1],las=1 )
axis(3,at=labxats,labels = colnames(ddd)[-1],las=1 )

On Thu, Dec 13, 2012 at 6:09 PM, Neal Humphrey <nhumphrey at clasponline.org> wrote:
Hi all,

I'm trying to figure out a way to create a data graphic that I haven't ever seen an example of before, but hopefully there's an R package out there for it. The idea is to essentially create a heatmap, but to allow each column to have a different width and each row a different height, rather than having uniform column widths and row heights. This is sort of like a Marimekko chart in appearance, except that rather than using a single color to represent the category, the color represents a value, and the row boundaries line up across all columns. That way color represents one variable, while the area of the cell represents another.

In my application, my heatmap has discrete categorical data rather than continuous. Rows are countries, columns are appliances, and I want to scale the height of each row to the fraction of global energy consumed by that country and the width of each column to the fraction of energy consumed by that appliance type. The color coding would then indicate whether or not that appliance is regulated in that country.
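As a rough sketch of the scaling described here, the fractions and the resulting cell boundaries could be computed as below. The TWh figures are the sample numbers posted later in this thread; the variable names are made up for illustration. Note that under this layout the area of a cell is simply the product of the row fraction and the column fraction, which is only an approximation of that country/appliance combination's actual share of energy use.

```r
# Sample totals from later in the thread (TWh).
country_twh <- c(Australia = 244, Canada = 522, China = 3503, EU = 2500,
                 India = 689.5, Japan = 997, USA = 3960)
product_twh <- c(Freezer = 72.4, Refrigerator = 379, HVAC = 466,
                 Lighting = 123, Television = 152)

# Row heights and column widths as fractions of the respective totals.
row_frac <- country_twh / sum(country_twh)
col_frac <- product_twh / sum(product_twh)

# Cumulative sums give the cell boundaries on a unit square.
y_edges <- c(0, cumsum(row_frac))
x_edges <- c(0, cumsum(col_frac))

# Area of each cell (rows = countries, columns = products).
cell_area <- outer(row_frac, col_frac)
```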

Any ideas how to make such a chart, or even what it might be called?


Neal Humphrey
NHumphrey at clasponline.org

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



