[R-sig-Geo] Creating density heatmaps for geographical data
Karl Ove Hufthammer
karl at huftis.org
Wed Oct 13 16:30:17 CEST 2010
Dear list members,
Does anybody have suggestions for the best way of creating density heatmaps
for geographical data? With ‘density heatmaps’ I’m not thinking of heatmaps
as in R’s ‘heatmap’ function, but more along the lines of
http://www.heatmapapi.com/
or
http://blog.spatialkey.com/2010/02/comparing-thematic-maps-with-density-
heatmaps/
(Yes, basically, it’s just a 2D density estimate overlain a map.)
I *have* managed to create an OK heatmap using ‘sm.density’ to get a density
estimate, converting it a format ‘levelplot’ understands, plotting it with
‘levelplot’, with the ‘panel’ function set to include a ‘sp.polygon’
function to first draw a simple map before drawing the actual density with
‘panel.levelplot’.
I’m not entirely happy with this approach, though. First note that the main
reason I use ‘sm.density’ instead of ‘kde2d’ is that ‘sm.density’ accepts a
‘weights’ arguments. Since each of my observations contains frequency/weight
information, I need this, and ‘sm.density’ seems to handle it well.
However, since the ‘heatmap’ is really a density estimate, it integrates to
1. I would instead want the heatmap colours to correspond to real
frequencies (of course, they do, but the actual mapping is not visible on
the colour scale). It would also be nice if I could set the range used for
the colour scale directly, so that maps created at different times would be
comparable in terms of their visual appearance (e.g., dark red would always
correspond to a value of 1000). Right now I use the bpy.colors palette, but
other suggestions are very welcome.
Another problem is that ‘sm.density’ uses a Gaussian kernel. This means that
my *entire* map is coloured, even areas far from any observations (and of
course overlapping the background map). My solution to this is to change
small values of the density estimate to NA values, but selecting the right
threshold is a rather manual procedure, and a value that works for one
density map may not work for a map for a different place, or at a different
time. The best solution would be a function that used a kernel with finite
support, so that any areas with a distance greater than the bandwidth from
the nearest observation would be transparent (have NA values).
Also, any special issues I should be aware of since this is *geographical*
data? I’m planning to use a UTM projection to avoid the problem with the
length of longitudes varying with the latitude. My data cover a rather large
area, so this is not a perfect solution, but should be good enough. (But if
there are any functions that can calculate heatmaps taking into account the
shape of the Earth, this would of course great).
So, any suggestions or experience with this type of plotting problem?
Perhaps I’m missing something obvious? :)
--
Regards,
Karl Ove Hufthammer
More information about the R-sig-Geo
mailing list