[R-sig-Geo] Counting points within polygons

Agustin Lobo alobolistas at gmail.com
Wed Oct 19 10:39:42 CEST 2011


well, what I sent is actually the same than you had written at the end
of your message, I went too fast and thought
it was the example of the help page of overlay.
Sorry for the noise
Agus

2011/10/19 Agustin Lobo <alobolistas at gmail.com>:
> You are right the vignette is pretty clear.
> I think this is a more elegant solution (BorneoLC is the Sp Pol DF and
> fireLC2 the Sp Points DF):
>
>> delme6 <- over(BorneoLC, fireLC2, returnList = TRUE)
>> nbp <- sapply(delme6,nrow)
>> newdata <- data.frame(BorneoLC at data,nbp=nbp)
>> BorneoLC2 <- BorneoLC
>> BorneoLC2 at data <- newdata
>> BorneoLC2 at data[c(1,1000),]
>    LC   ID LC_2         LCname LCabrv     AREA PERIMETER nbp
> 0    4 4505    4 Lowland Forest    LwF 0.000077  0.036385   0
> 999  8 6745    8 Lowland Mosaic    LwM 0.007168  0.864000   2
>
> Perhaps having an sp function for the specific task of counting points
> within polygons would make sense, I think that
> would be quite popular.
>
> Agus
>
>
> 2011/10/18 Edzer Pebesma <edzer.pebesma at uni-muenster.de>:
>>
>>
>> On 10/18/2011 06:32 PM, Agustin Lobo wrote:
>>>
>>> I'm afraid I do not understand well the help page.
>>> It seems to me that what I need is
>>> "x = "SpatialPointsDataFrame", y = "SpatialPolygons"
>>>  equal to the previous method, except that an argument fn=xxx is
>>> allowed, e.g. fn = mean which will then report a data.frame with the
>>> mean values of the x points falling in each polygon (set) of y"
>>
>> No, you need the other way around -- this is what I got wrong with overlay,
>> long ago, where argument order did not matter. over(a,b) is the opposite of
>> over(b,a)
>>
>> The text of the "over" documentation is hard to read, and I find the concept
>> hard to explain in words, without giving examples. That's why I wrote the
>> vignette.
>>
>>> Thus:
>>>>
>>>> delme4<- over(fireLC2[,5], BorneoLC,fn=sum)
>>>
>>> where
>>>>
>>>> BorneoLC at data[1,]
>>>
>>>   LC   ID LC_2         LCname LCabrv    AREA PERIMETER
>>> 0  4 4505    4 Lowland Forest    LwF 7.7e-05  0.036385
>>> and
>>>>
>>>> fireLC2 at data[1,]
>>>
>>>   LC ID                      LCname LCabrv nbp
>>> 1  7  1 Plantation_Secondary Forest  PltSF   1
>>>
>>> but I get
>>> Error in FUN(X[[1L]], ...) :
>>>   only defined on a data frame with all numeric variables
>>> Calls: over ... do.call ->  lapply ->  Summary.data.frame ->  lapply ->
>>>  FUN
>>>
>> It would be helpful if you could provide a reproducible script that leads to
>> this error, or even the command that gave you this error.
>>
>>> which I do not understand because fireLC2[,5] is numeric
>>> (suggestion: could the user define the field(s) for which he wants fn
>>> being applied? )
>>
>> How about passing an object with numeric attributes (selected)?
>>
>>> Beyond that, where the help page says:
>>> "...which will then report a data.frame with the mean values of the x
>>> points falling in each polygon (set) of y",
>>> what happens if there is no points falling for a given polygon? Would
>>> the record be eliminated or kept as NA ?
>>>>
>>>> From my point of view, having the option of getting a data.frame with
>>>
>>> the same nb. of rows and same ordering as
>>> the input Sp Polygons object would be an advantage.
>>
>> That is why over was written to replace overlay.
>>
>> I modified the example in "over" to illustrate your two cases:
>>
>> ############ start script
>> require(sp)
>> r1 = cbind(c(180114, 180553, 181127, 181477, 181294, 181007, 180409,
>> 180162, 180114), c(332349, 332057, 332342, 333250, 333558, 333676,
>> 332618, 332413, 332349))
>> r2 = cbind(c(180042, 180545, 180553, 180314, 179955, 179142, 179437,
>> 179524, 179979, 180042), c(332373, 332026, 331426, 330889, 330683,
>> 331133, 331623, 332152, 332357, 332373))
>> r3 = cbind(c(179110, 179907, 180433, 180712, 180752, 180329, 179875,
>> 179668, 179572, 179269, 178879, 178600, 178544, 179046, 179110),
>> c(331086, 330620, 330494, 330265, 330075, 330233, 330336, 330004,
>> 329783, 329665, 329720, 329933, 330478, 331062, 331086))
>> r4 = cbind(c(180304, 180403,179632,179420,180304),
>> c(332791, 333204, 333635, 333058, 332791))
>>
>> sr1=Polygons(list(Polygon(r1)),"r1")
>> sr2=Polygons(list(Polygon(r2)),"r2")
>> sr3=Polygons(list(Polygon(r3)),"r3")
>> sr4=Polygons(list(Polygon(r4)),"r4")
>> sr=SpatialPolygons(list(sr1,sr2,sr3,sr4))
>> srdf=SpatialPolygonsDataFrame(sr, data.frame(cbind(1:4,5:2),
>> row.names=c("r1","r2","r3","r4")))
>>
>> data(meuse)
>> coordinates(meuse) = ~x+y
>>
>> plot(meuse)
>> polygon(r1)
>> polygon(r2)
>> polygon(r3)
>> polygon(r4)
>> # retrieve mean heavy metal concentrations per polygon:
>> # attribute means over each polygon, NA for empty
>> over(sr, meuse[,1:4], fn = mean)
>>
>> # return the number of points in each polygon:
>> sapply(over(sr, geometry(meuse), returnList = TRUE), length)
>> ######## end script
>>
>> Get some grip on "over". Forget overlay, it will get deprecated. "over"
>> doesn't stop with sp; you'll find further "over" methods in packages rgeos
>> and spacetime.
>>
>> Hth,
>> --
>> Edzer Pebesma
>> Institute for Geoinformatics (ifgi), University of Münster
>> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
>> 8333081, Fax: +49 251 8339763  http://ifgi.uni-muenster.de
>> http://www.52north.org/geostatistics      e.pebesma at wwu.de
>>
>



More information about the R-sig-Geo mailing list