[R] Plotting from different data sources on the same plot (with ggplot2)
jiho
jo.irisson at gmail.com
Thu Sep 27 12:52:18 CEST 2007
Hello everyone (and Hadley in particular),
I often need to plot data from multiple datasets on the same graph. A
common example is when mapping some values: I want to plot the
underlying map and then add the points. I currently do it with base
graphics, by recording the maximum region in which my map and points
will fit, plotting both with these xlim and ylim parameters, calling
par(new=T) between plot calls, and setting the graphical parameters
(to draw axes and titles, and to set the aspect ratio) by hand. This is
neither easy nor practical as the plots become more complicated.
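For reference, the base graphics approach described above looks roughly
like the sketch below. The two small data frames are made-up stand-ins
for my real coastline and stations (only the column names lon and lat
match my data), and the exact graphical parameters are illustrative.

```r
# Made-up stand-ins for the real data (assumption: columns lon and lat)
coast  <- data.frame(lon = c(0, 2, 5, 7, 10), lat = c(40, 42, 41, 44, 43))
coords <- data.frame(lon = c(3, 4, 6), lat = c(41.5, 45, 39))

# Record the maximum region in which both the map and the points fit
xlim <- range(coast$lon, coords$lon)
ylim <- range(coast$lat, coords$lat)

# First plot: the coastline, with axes, labels and aspect ratio set by hand
plot(coast$lon, coast$lat, type = "l", xlim = xlim, ylim = ylim, asp = 1,
     xlab = "lon", ylab = "lat")

# Second plot drawn on top of the first one, using the same limits and
# aspect ratio, with axes and labels suppressed to avoid overprinting
par(new = TRUE)
plot(coords$lon, coords$lat, type = "p", xlim = xlim, ylim = ylim, asp = 1,
     axes = FALSE, xlab = "", ylab = "")
```

Every additional data source needs its range added to xlim and ylim and
another par(new=TRUE) plus plot() pair, which is what becomes tedious.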
The ggplot book specifies that "[ggplot] makes it easy to combine
data from multiple sources". Since I use ggplot2 as much as I can
(thanks it's really really great!) I thought I would try producing
such a plot with ggplot2.
NB: If this is possible/easy with another plotting package please
let me know. I am not looking for something specific to maps but
rather for a generic mechanism to throw several pieces of data to a
graph and have the plotting routine take care of setting up axes that
will fit all data on the same scale.
So, now for the ggplot2 part. I have two data sources: the
coordinates of the coastlines in a region of interest and the
coordinates of sampling stations in a subset of this region. I want
to plot the coastline as a line and the stations as points, on the
same graph. I can plot them independently easily:
p1 = ggplot(coast,aes(x=lon,y=lat)) + geom_path() + coord_equal(ratio=1)
p1$aspect.ratio = 1
p2 = ggplot(coords,aes(x=lon,y=lat)) + geom_point() + coord_equal(ratio=1)
p2$aspect.ratio = 1
but I cannot find out how to combine the two graphs. I suspect this
probably has to be done via different layers but I really can't see how.
In particular, I would like to know how to deal with the scales: can
ggplot take care of plotting the two datasets on the same coordinate
system or do I have to manually record the maximal range of x and y
and force ggplot to use this on both layers, as I did with base
graphics? (of course I would prefer the former ;) ).
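To make the question concrete, the kind of call I am hoping for is
sketched below. This is only a sketch under assumptions: it supposes
that each geom accepts its own data argument and that the scales are
then computed over all layers together, which is how recent ggplot2
releases behave but which I have not confirmed for the version I have
installed. The data frames are made-up stand-ins, not my real data.

```r
library(ggplot2)

# Made-up stand-ins for the real data (assumption: columns lon and lat)
coast  <- data.frame(lon = c(0, 2, 5, 7, 10), lat = c(40, 42, 41, 44, 43))
coords <- data.frame(lon = c(3, 4, 6), lat = c(41.5, 45, 39))

# One plot, two layers, each layer reading from its own data frame
p <- ggplot() +
  geom_path(data = coast, aes(x = lon, y = lat)) +
  geom_point(data = coords, aes(x = lon, y = lat)) +
  coord_equal(ratio = 1)

print(p)
```

Here the two data sources stay separate, and further layers (for
example trajectories from a third data frame) would be added with
another + geom_...(data = ...) term.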
To test it further with real data, here is my code and data:
http://jo.irisson.free.fr/dropbox/test_ggplot2.zip
One additional clarification: I would like the two datasets to stay
separate. Indeed I could probably combine them and plot everything
in one step by clever use of ggplot arguments. However this is just a
simple example and I would like to add more in the future (like
trajectories at each station, points proportional to some value at
each station etc.) so I really want the different data sources to be
separated and to produce the plot in several steps, otherwise it will
soon become too complicated to manage.
Thank you very much in advance for your help.
JiHO
---
http://jo.irisson.free.fr/