[R] Thinking about using two y-scales on your plot?

Richard Cotton Richard.Cotton at hsl.gov.uk
Mon Apr 7 12:24:23 CEST 2008

thegeologician wrote:
> A plot of the actual temperature during a year (or thousands of years, 
> as people in palaeoclimate-studies are rather used to) is just so much 
> more intuitive, than some correlation-coefficients or such. I know I'm 
> largely speaking to statisticians in this forum, but in Earth Sciences, 
> most people aren't... I see the use of correlation coefficients and 
> -plots in proving that an apparent correlation is "real", but the first 
> question upon presenting any statistical analysis is always "What does the 
> DATA look like?".

Agreed - the data itself is much easier to get to grips with than
correlation coefficients.

thegeologician wrote:
> Of course, these plots could be plotted separately with a common x-axis, 
> it's just a matter of saving space and of being used to that kind of 
> graph. I can't imagine anyone being falsely led to a thought like "oh 
> gosh, the temperature is much higher/bigger/more than the 
> precipitation!" - that makes no sense. I do see the point in graphs 
> where values are plotted together, whose possible interaction with each 
> other might lead to wrong conclusions. Then, it might not be obvious 
> that one is drawing a senseless conclusion.

I think in the temperature/precipitation case, whether to draw multiple
y-axes or not is a fairly minor decision.  The reader would have to be
pretty dumb to assume that temperatures and precipitation can be compared
directly.  The point is that the plot can make it look that way - so the
reader has to engage their brain and tell themselves "ignore the obvious
comparisons between the lines that I perceive".  This is clearly not a
desirable trait in a graph.

I've concocted an example to show that it's possible to mislead unwary
readers by changing the scale of one of the y-axes.

This uses the nottem temperature dataset built into R, and some made-up
precipitation data.

#Generate some precipitation data
precipitation =
pts <- ts(precipitation, start=1920, frequency=12)

#First plot, correlation is apparent
plot(pts, axes=FALSE, col="blue", ylab="")

#Second plot, scale changing makes it appear that precipitation does not
#vary with temperature.
plot(pts, axes=FALSE, col="blue", ylab="", ylim=c(0,10000))
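
The snippet above does not show the precipitation values or the temperature
layer, so here is a minimal self-contained sketch of the same idea.  It is
not the original code: the made-up precipitation series (temperature plus
noise), the helper name draw_two_scales, the seed and the axis labels are
all assumptions chosen for illustration.

```r
# Assumption: the original precipitation values are not shown, so this
# made-up series simply tracks temperature with some added noise.
set.seed(1)
precipitation <- 50 + 2 * as.numeric(nottem) + rnorm(length(nottem), sd = 5)
pts <- ts(precipitation, start = 1920, frequency = 12)

# Hypothetical helper: draw nottem on the left axis and the made-up
# precipitation on the right axis, optionally forcing the right-axis limits.
draw_two_scales <- function(precip_ylim = NULL) {
  old_par <- par(mar = c(5, 4, 4, 4) + 0.1)  # leave room for the right axis
  on.exit(par(old_par))
  plot(nottem, col = "red", ylab = "Temperature (deg F)")
  par(new = TRUE)                            # overlay the next plot
  plot(pts, axes = FALSE, col = "blue", xlab = "", ylab = "",
       ylim = precip_ylim)
  axis(4)
  mtext("Precipitation (made-up units)", side = 4, line = 3)
  invisible(NULL)
}

draw_two_scales()                # default limits: the two lines move together
draw_two_scales(c(0, 10000))     # stretched limits: precipitation looks flat
```

The data are identical in both calls; only the limits of the second axis
differ, which is enough to change what the reader perceives.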

I'm willing to concede that the attempt at misleading the audience is pretty
artificial, and not very subtle.  A more dangerous case would be the
opposite situation - making a correlation become visible on a plot where
none really exists, by fiddling with axis transformations (you could use a
log scale on the second y-axis, or any other transformation you wished).
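
As a rough sketch of that opposite situation, the code below overlays two
independently generated series, once with a linear second axis and once with
a log second axis.  Everything here is hypothetical (the series a and b, the
helper name overlay, the seed); the two series share nothing but an upward
trend, and the point is only that a transformation on the second axis
changes how closely the lines seem to track each other.

```r
# Assumption: both series are invented for illustration and are generated
# with independent noise; they have only an upward trend in common.
set.seed(42)
year <- 1:50
a <- 10 + 0.5 * year + rnorm(50, sd = 1)    # roughly linear growth
b <- exp(0.1 * year + rnorm(50, sd = 0.2))  # roughly exponential growth

# Hypothetical helper: a on the left axis, b on the right axis.
# second_axis_log = "y" puts the right axis on a log scale.
overlay <- function(second_axis_log = "") {
  old_par <- par(mar = c(5, 4, 4, 4) + 0.1)
  on.exit(par(old_par))
  plot(year, a, type = "l", col = "red", ylab = "a")
  par(new = TRUE)
  plot(year, b, type = "l", col = "blue", axes = FALSE,
       xlab = "", ylab = "", log = second_axis_log)
  axis(4)
  mtext("b", side = 4, line = 3)
  invisible(NULL)
}

overlay()      # linear second axis: the shapes clearly differ
overlay("y")   # log second axis: the lines appear to move in lockstep
```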

I suspect that the popularity of multiple y-axes arose from a greater need
to save space in paper-based journals, but in the age of electronic
documents, is space saving really that important? 


Mathematical Sciences Unit