[R-sig-Geo] Spatial overlay vs. extract and aggregation methods
Juta Kawalerowicz
juta.kawalerowicz at nuffield.ox.ac.uk
Mon May 18 16:10:09 CEST 2015
P.S I found a solution which involves overlay and is faster than raster and
extract functions. I think it may be useful in situations when you try to
aggregate (many) small units over boundaries of some larger units,
conditioned on smaller units being completely nested within larger ones
(i.e there are no cases where a small unit falls in more than one larger
unit). I used this to recalculate boundaries between 2001 and 2011 UK
census, using 2001 Output Areas as spdf1 and electoral boundaries 2011 as
spdf2.
1) calculate centroids for spdf1, overlay with spdf2
nc.sids <- readShapeSpatial(system.file("shapes/sids.shp",
package="maptools")[1],
IDvar="FIPSNO", proj4string=CRS("+proj=longlat
+ellps=clrk66"))
nc.sids <- nc.sids[c("FIPSNO", "NAME" ,"BIR74")]
centroids = gCentroid(nc.sids,byid=TRUE)
overlay<-over(centroids, nc.sids)
sapply(over(nc.sids, geometry(centroids), returnList = TRUE), length)
vect<-(overlay[[1]])
for (i in 1:length(vect)){
nc.sids$overlay[i]<-vect[i]
}
aggregated<-tapply(nc.sids at data$BIR74, nc.sids at data$overlay, sum)
cor(nc.sids at data$BIR74, aggregated)
On Tue, Apr 28, 2015 at 1:47 PM, Edzer Pebesma <
edzer.pebesma at uni-muenster.de> wrote:
>
>
> On 04/23/2015 07:04 PM, Juta Kawalerowicz wrote:
> > Hi,
> >
> > I have a spatial polygon data frame (spdf1) and try to aggregate
> > information from it over polygons in a larger spatial polygon data
> > frame(spdf2). I wanted to run some simple test to see that my code is
> sound
> > (on nc.sids from maptools), but run into a problem which I don't
> > understand. As far as I can see, there are 2 ways to do this:
> >
> > 1) rasterise the smaller spdf1, extract and calculate over larger spdf2
> >
> > library(maptools)
> > library(raster)
> > nc.sids <- readShapeSpatial(system.file("shapes/sids.shp",
> > package="maptools")[1],
> > IDvar="FIPSNO",
> proj4string=CRS("+proj=longlat
> > +ellps=clrk66"))
> >
> > #plot(nc.sids)
> > r<-raster(ncol=180, nrow=180)
> > extent(r)<-extent(nc.sids)
> > rp<-rasterize(nc.sids, r, 'BIR74')
> > plot(rp)
> > plot(nc.sids, add=TRUE)
> >
> > #this will take time on larger files...
> > v <- extract(rp, nc.sids, weights=TRUE)
> > res<-sapply(v, function(x) if (!is.null(x)) {sum(apply(x, 1, prod),
> > na.rm=TRUE) / sum(x[,2])} else NA )
> > cor(res, nc.sids$BIR74)
> > #seems to be working ok.
> >
> > 2) use over function from sp package
> >
> > overlay<-over(nc.sids, nc.sids)
> > #this should return the same indices of nc.sids which fall inside
> nc.sids -
> > basically should be falling within itself for each polygon, right? But
> this
> > is not what I get.
>
> no, it uses gIntersects and picks the first intersecting polygon, which
> might well be a neighbour. (Actually, over(nc.sids, geometry(nc.sids))
> should return indexes and this call the corresponding attribute entries;
> this is a bug, now fixed in rgeos on r-forge).
>
> https://stat.ethz.ch/pipermail/r-sig-geo/2015-April/022636.html
>
> has some pointers to the weighted aggregation problem of non-matching
> polygons; sp 1.1-0 has the rgeos implementation mentioned
> there, as option with sp::aggregate; see the last example of
>
> example(aggregate).
>
> >
> > Any hints would be greatly appreciated.
> >
> > Cheers,
> > Juta
> >
> > [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > R-sig-Geo mailing list
> > R-sig-Geo at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >
>
> --
> Edzer Pebesma
> Institute for Geoinformatics (ifgi), University of Münster,
> Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
> Journal of Statistical Software: http://www.jstatsoft.org/
> Computers & Geosciences: http://elsevier.com/locate/cageo/
> Spatial Statistics Society http://www.spatialstatistics.info
>
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>
[[alternative HTML version deleted]]
More information about the R-sig-Geo
mailing list