[R-sig-Geo] Counting points within polygons

Edzer Pebesma edzer.pebesma at uni-muenster.de
Tue Oct 18 21:30:15 CEST 2011



On 10/18/2011 06:32 PM, Agustin Lobo wrote:
> I'm afraid I do not understand well the help page.
> It seems to me that what I need is
> "x = "SpatialPointsDataFrame", y = "SpatialPolygons"
>   equal to the previous method, except that an argument fn=xxx is
> allowed, e.g. fn = mean which will then report a data.frame with the
> mean values of the x points falling in each polygon (set) of y"

No, you need the other way around -- this is what I got wrong with 
overlay, long ago, where argument order did not matter. over(a,b) is the 
opposite of over(b,a)

The text of the "over" documentation is hard to read, and I find the 
concept hard to explain in words, without giving examples. That's why I 
wrote the vignette.

> Thus:
>> delme4<- over(fireLC2[,5], BorneoLC,fn=sum)
> where
>> BorneoLC at data[1,]
>    LC   ID LC_2         LCname LCabrv    AREA PERIMETER
> 0  4 4505    4 Lowland Forest    LwF 7.7e-05  0.036385
> and
>> fireLC2 at data[1,]
>    LC ID                      LCname LCabrv nbp
> 1  7  1 Plantation_Secondary Forest  PltSF   1
>
> but I get
> Error in FUN(X[[1L]], ...) :
>    only defined on a data frame with all numeric variables
> Calls: over ... do.call ->  lapply ->  Summary.data.frame ->  lapply ->  FUN
>
It would be helpful if you could provide a reproducible script that 
leads to this error, or even the command that gave you this error.

> which I do not understand because fireLC2[,5] is numeric
> (suggestion: could the user define the field(s) for which he wants fn
> being applied? )

How about passing an object with numeric attributes (selected)?

> Beyond that, where the help page says:
> "...which will then report a data.frame with the mean values of the x
> points falling in each polygon (set) of y",
> what happens if there is no points falling for a given polygon? Would
> the record be eliminated or kept as NA ?
>> From my point of view, having the option of getting a data.frame with
> the same nb. of rows and same ordering as
> the input Sp Polygons object would be an advantage.

That is why over was written to replace overlay.

I modified the example in "over" to illustrate your two cases:

############ start script
require(sp)
r1 = cbind(c(180114, 180553, 181127, 181477, 181294, 181007, 180409,
180162, 180114), c(332349, 332057, 332342, 333250, 333558, 333676,
332618, 332413, 332349))
r2 = cbind(c(180042, 180545, 180553, 180314, 179955, 179142, 179437,
179524, 179979, 180042), c(332373, 332026, 331426, 330889, 330683,
331133, 331623, 332152, 332357, 332373))
r3 = cbind(c(179110, 179907, 180433, 180712, 180752, 180329, 179875,
179668, 179572, 179269, 178879, 178600, 178544, 179046, 179110),
c(331086, 330620, 330494, 330265, 330075, 330233, 330336, 330004,
329783, 329665, 329720, 329933, 330478, 331062, 331086))
r4 = cbind(c(180304, 180403,179632,179420,180304),
c(332791, 333204, 333635, 333058, 332791))

sr1=Polygons(list(Polygon(r1)),"r1")
sr2=Polygons(list(Polygon(r2)),"r2")
sr3=Polygons(list(Polygon(r3)),"r3")
sr4=Polygons(list(Polygon(r4)),"r4")
sr=SpatialPolygons(list(sr1,sr2,sr3,sr4))
srdf=SpatialPolygonsDataFrame(sr, data.frame(cbind(1:4,5:2), 
row.names=c("r1","r2","r3","r4")))

data(meuse)
coordinates(meuse) = ~x+y

plot(meuse)
polygon(r1)
polygon(r2)
polygon(r3)
polygon(r4)
# retrieve mean heavy metal concentrations per polygon:
# attribute means over each polygon, NA for empty
over(sr, meuse[,1:4], fn = mean)

# return the number of points in each polygon:
sapply(over(sr, geometry(meuse), returnList = TRUE), length)
######## end script

Get some grip on "over". Forget overlay, it will get deprecated. "over" 
doesn't stop with sp; you'll find further "over" methods in packages 
rgeos and spacetime.

Hth,
-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
http://www.52north.org/geostatistics      e.pebesma at wwu.de



More information about the R-sig-Geo mailing list