[R-sig-Geo] over() in sp package

Kenny Bell kmb56 at berkeley.edu
Fri Apr 17 05:56:19 CEST 2015


Thanks so much for this.

The code does work. However, the gIntersection() stage is very time
intensive, so I will continue to use my own code that relies on joinPolys
in PBSmapping (faster).

It also allows you to store the rows/weights of the grid so that
aggregating many grids to polygons is feasible.

I have compared the two for aggregating a temperature grid up to an Alabama
polygon and they match, so your code was extremely helpful for verification
- Thanks!

The code is in an R package on my Github
<https://github.com/kendonB/gistools>, for anyone interested. This is my
first R package I've shared, so please make comments through the issues
system on Github.

On Wed, Apr 15, 2015 at 3:19 PM, Edzer Pebesma <
edzer.pebesma at uni-muenster.de> wrote:

>
> On 04/14/2015 06:58 PM, Kenny Bell wrote:
> > Thanks so much for all this. Yes, I'm after the area weighted averages.
> The
> > eventual application is to aggregate a 2.5 arcminute temperature grid up
> to
> > a zip-5 polygon, many of which overlap only 2-3 grid cells.
>
> One approach would be to use spsample to resample the temperature to a
> finer grid, then use aggregate.
>
> Another approach would be to convert the grid to SpatialPolygons, and
> then use the function below, which should do exact area weighting. Let
> me know if it works, or if you have other comments / questions.
>
>
> aggregatePoly = function(x, by) {
>         require(rgeos)
>         i = gIntersection(x, by, byid = TRUE)
>         ids.i = sapply(i at polygons, function(x) slot(x, name = "ID"))
>         grp_by = sapply(strsplit(ids.i, " "), function(x) x[2])
>
>         obs = over(SpatialPoints(coordinates(i)), x)
>         obs$area = sapply(i at polygons, function(x)
>                 slot(x, name = "area"))
>         obs$x_area = obs[[1]] * obs$area
>         agg = aggregate(obs[c("x_area", "area")], list(grp_by), sum)
>         agg$value = with(agg, x_area / area)
>         agg = agg[match(row.names(by), agg$Group.1),]
>         row.names(agg) = row.names(by)
>         SpatialPolygonsDataFrame(by, agg)
> }
>
> # Example with two partly overlapping grids:
> library(sp)
>
> gt1 = SpatialGrid(GridTopology(c(0,0), c(1,1), c(4,4)))
> gt2 = SpatialGrid(GridTopology(c(-1.25,-1.25), c(1,1), c(4,4)))
>
> # convert both to polygons; give p1 attributes to aggregate
> p1 = SpatialPolygonsDataFrame(as(gt1, "SpatialPolygons"),
>                 data.frame(v = 1:16), match.ID = FALSE)
> p2 = as(gt2, "SpatialPolygons")
>
> # plot the scene:
> plot(p1, xlim = c(-2,4), ylim = c(-2,4))
> plot(p2, add = TRUE, border = 'red')
> library(rgeos)
> i = gIntersection(p1, p2, byid = TRUE)
> plot(i, add=TRUE, density = 5, col = 'blue')
> # plot IDs p2:
> ids.p2 = sapply(p2 at polygons, function(x) slot(x, name = "ID"))
> text(coordinates(p2), ids.p2)
> # plot IDs i:
> ids.i = sapply(i at polygons, function(x) slot(x, name = "ID"))
> text(coordinates(i), ids.i, cex = .8, col = 'blue')
>
> # compute & plot area-weighted average of p1[[1]]
> ret = aggregatePoly(p1, p2)
> spplot(ret["value"])
>
>
> >
> > On Mon, Apr 13, 2015 at 10:47 PM, Edzer Pebesma <
> > edzer.pebesma at uni-muenster.de> wrote:
> >
> >>
> >>
> >> On 04/14/2015 01:27 AM, Kenny Bell wrote:
> >>> Hi,
> >>>
> >>> When I call:
> >>>
> >>> over(spPolyDf, spGridDf)
> >>>
> >>> The function seems to be returning the spGridDf$band1 value for the
> first
> >>> grid cell that is contained in each polygon.
> >>
> >> This is documented; you can use returnList = TRUE to get the list of
> >> values for each polygon. Since you want to aggregate these anyway, you
> >> might as well use
> >>
> >> aggregate(spGridDF, spPolyDf, mean)
> >>
> >> this is gives a simple average of pixels whose centers fall inside the
> >> polygons. Do you need a weighted average, weighted by how much of the
> >> pixel overlaps with each polygon?
> >>
> >>>
> >>> What I would like is an overlap-area weighted average (or some provided
> >>> function like sum) of all the grid cells contained (or partially
> >> contained)
> >>> in each polygon element of spPolyDf.
> >>>
> >>> I have code that does this by eventually using joinPolys() from
> >> PBSMapping,
> >>> but it's slow and would be much cleaner to have a sp-only solution. Is
> >>> there any way to do this? Some example code is below:
> >>>
> >>> library(sp)
> >>> library(rgdal)
> >>>
> >>> # Download a USA shapefile
> >>> tmpFile <- tempfile()
> >>> download.file("
> >>> http://www2.census.gov/geo/tiger/GENZ2013/cb_2013_us_state_20m.zip",
> >>> destfile = tmpFile)
> >>> unzipped <- unzip(zipfile = tmpFile, exdir = getwd())
> >>>
> >>> # This is the actual reading stage
> >>> states <- readOGR(getwd(), layer="cb_2013_us_state_20m")
> >>>
> >>> # Remove all the files on disk
> >>> file.remove(tmpFile)
> >>> file.remove(unzipped)
> >>>
> >>> tmpFile <- tempfile()
> >>>
> >>> url <- paste0("http://www.prism.oregonstate.edu/fetchData.php?type",
> >>>               "=bil&kind=recent&elem=tmax&range=daily&temporal=",
> >>>               "20140416")
> >>>
> >>> # Download the file to fileName
> >>> download.file(url = url, mode = "wb", destfile = tmpFile)
> >>>
> >>> # unzip the file to the working directory.
> >>> unzippedFiles <- unzip(zipfile = tmpFile, overwrite = TRUE)
> >>>
> >>> # read into R
> >>> tMaxGrid <- readGDAL(fname = unzippedFiles[1])
> >>>
> >>> # delete the file
> >>> file.remove(tmpFile)
> >>> file.remove(unzippedFiles)
> >>>
> >>> # This doesn't work correctly
> >>> states$tmax20140416 <- over(states, tMaxGrid)
> >>
> >> this would assign a data.frame to what you want to be a vector (in a
> >> data.frame), so it would have to be:
> >>
> >> states$tmax20140416 <- over(states, tMaxGrid)[[1]]
> >>
> >> I'll try to modify sp so that it'd warn you in this case.
> >>
> >>
> >> --
> >> Edzer Pebesma
> >> Institute for Geoinformatics (ifgi),  University of Münster,
> >> Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
> >> Journal of Statistical Software:   http://www.jstatsoft.org/
> >> Computers & Geosciences:   http://elsevier.com/locate/cageo/
> >> Spatial Statistics Society http://www.spatialstatistics.info
> >>
> >>
> >> _______________________________________________
> >> R-sig-Geo mailing list
> >> R-sig-Geo at r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
> >>
> >>
> >
> >
>
> --
> Edzer Pebesma
> Institute for Geoinformatics (ifgi),  University of Münster,
> Heisenbergstraße 2, 48149 Münster, Germany; +49 251 83 33081
> Journal of Statistical Software:   http://www.jstatsoft.org/
> Computers & Geosciences:   http://elsevier.com/locate/cageo/
> Spatial Statistics Society http://www.spatialstatistics.info
>
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
>


-- 
Kendon Bell
Email: kmb56 at berkeley.edu
Phone: (510) 612-3375

Ph.D. Student
Department of Agricultural & Resource Economics
University of California, Berkeley

Graduate Student Researcher
Energy Biosciences Institute
http://www.energybiosciencesinstitute.org/

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list