[R-sig-Geo] How to extract coordinates values from a shapefile?
Paul Hiemstra
p.hiemstra at geo.uu.nl
Fri Jun 11 11:43:34 CEST 2010
On 06/09/2010 11:01 PM, Dylan Beaudette wrote:
> On Wednesday 09 June 2010, Hadley Wickham wrote:
>
>> On Wed, Jun 9, 2010 at 3:24 PM, Edzer Pebesma
>>
>> <edzer.pebesma at uni-muenster.de> wrote:
>>
>>> The example provided by Matt assumes that each polygon consists of a
>>> single ring, and doesn't have islands, lakes etc. The function below
>>> pasts all coordinates to a single 2-column matrix. For clarity's sake, I
>>> avoided nested sapply's.
>>>
>>> library(maptools)
>>> xx<- readShapePoly(system.file("shapes/sids.shp",
>>> package="maptools")[1], IDvar="FIPSNO", proj4string=CRS("+proj=longlat
>>> +ellps=clrk66")) allcoordinates = function(x) {
>>> ret = NULL
>>> polys = x at polygons
>>> for(i in 1:length(polys)) {
>>> pp = polys[[i]]@Polygons
>>> for (j in 1:length(pp))
>>> ret = rbind(ret, coordinates(pp[[j]]))
>>> }
>>> ret
>>> }
>>> q = allcoordinates(xx)
>>>
>>> # check:
>>> plot(xx)
>>> lines(q, col='blue')
>>>
>> And that's basically what fortify does, except it covers a few more cases.
>>
>> Hadley
>>
> Not to be nit-picky, but wouldn't it be safer to pre-allocated "ret" instead
> of dynamically appending? This would allow the suggested function to scale to
> very large geometries.
>
In addition to my previous suggestion, I tried using lapply + do.call
instead of the for loop based implementation of edzer:
library(maptools)
xx<- readShapePoly(system.file("shapes/sids.shp",
package="maptools")[1], IDvar="FIPSNO", proj4string=CRS("+proj=longlat
+ellps=clrk66"))
allcoordinates = function(x) {
ret = NULL
polys = x at polygons
for(i in 1:length(polys)) {
pp = polys[[i]]@Polygons
for (j in 1:length(pp))
ret = rbind(ret, coordinates(pp[[j]]))
}
ret
}
allcoordinates_lapply = function(x) {
polys = x at polygons
return(do.call("rbind", lapply(polys, function(pp) {
do.call("rbind", lapply(pp at Polygons, coordinates))
})))
}
q = allcoordinates(xx)
z = allcoordinates_lapply(xx)
all.equal(q,z)
So far it produces the same output. And now comes the nice part, doing a
test with a large polygon set and timing the performance of both functions:
> system.time(x <- allcoordinates(nuts))
user system elapsed
22.781 8.572 31.422
> system.time(y <- allcoordinates_lapply(nuts))
user system elapsed
0.248 0.004 0.250
> all.equal(x,y)
[1] TRUE
So apart from the imo nicer looking function in terms of syntax, it is
also several orders of magnitude faster. This effect will only become
stronger if the polygon set becomes larger. So, lapply + do.call is your
friend :).
cheers,
Paul
> sessionInfo()
R version 2.11.0 (2010-04-22)
i486-pc-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] maptools_0.7-34 lattice_0.18-5 sp_0.9-62 foreign_0.8-40
loaded via a namespace (and not attached):
[1] grid_2.11.0
>
> Cheers,
> Dylan
>
>
>
>
--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone: +3130 274 3113 Mon-Tue
Phone: +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
More information about the R-sig-Geo
mailing list