[R-sig-Geo] stratified random sampling

Dylan Beaudette dylan.beaudette at gmail.com
Wed Feb 25 17:11:04 CET 2009


On Wed, Feb 25, 2009 at 12:53 AM, Roger Bivand <Roger.Bivand at nhh.no> wrote:
> On Tue, 24 Feb 2009, Heuvelink, Gerard wrote:
>
>> Dear list,
>>
>> The stratified random sampling problem that I submitted a few days ago has
>> already been solved, with the help of several of you, notably Edzer Pebesma.
>> Edzer came up with the following solution:
>
> WARNING!! This is only conditionally correct. See below for analysis.
>
>>
>> library(sp)
>> library(rgdal)
>> nc1 <- readShapePoly(system.file("shapes/sids.shp",package="maptools")[1],
>
> + proj4string=CRS("+proj=longlat +datum=NAD27"))
>>
>> pts = do.call(rbind, sapply(slot(nc1, "polygons"), spsample, n=1,
>
> + type="random"))
>>
>> plot(nc1)
>> points(pts, col='blue', pch=19, cex=1)
>>
>> As it happened, the do.call statement did not work in my case (Edzer and
>> Roger may look into why it does not work with all shapes)
>
> It was because the shapefile was no good, with both self-intersecting and
> overlapping polygons. This left some values returned by spsample in the
> sapply call as NULL (correctly), and no rbind() method exists for
> SpatialPoints and NULL objects.
>
>> and had to be replaced by:
>>
>> for (i in 1:length(slot(nc1, "polygons"))) {
>>   pt = spsample(nc1[i,], n=1, type="random")
>>   if (i == 1)
>>       pts = pt
>>   else
>>       pts = rbind(pts, pt)
>> }
>>
>
> This gets a result - there are other ways of doing it too, but it is not
> what it seems. Because the polygons are self-intersecting and overlapping,
> the point-in-polygon algorithm is not choosing points correctly (spsample
> for a ring generates more points than needed in the bounding box of the
> polygon, and chooses the number needed from those that fall within the
> polygon).
>
> The input shapefile is a vectorised map of soil types from raster data, but
> unfortunately the software used to generate it is unknown, so we can't warn
> people off it. Assuming that only one soil type occurs in each raster cell,
> we can reproduce the case with meuse.grid:
>
> library(sp)
> data(meuse.grid)
> coordinates(meuse.grid) <- c("x", "y")
> gridded(meuse.grid) <- TRUE
> meuse.grid$soil <- factor(meuse.grid$soil)
> spplot(meuse.grid, "soil")
> meuseSP <- as(meuse.grid, "SpatialPolygons")
> ID <- as.character(meuse.grid$soil)
> library(maptools)
> meuseSP1 <- unionSpatialPolygons(meuseSP, ID)
> plot(meuseSP1, col=1:3)
> pts = do.call(rbind, sapply(slot(meuseSP1, "polygons"), spsample, n=1,
>  type="random"))
> points(pts, pch=3, col="white")
>
> Doing the raster to vector conversion in R ought to resolve the underlying
> problem of the soil polygons being generated from the raster values in an
> inappropriate way.

Thanks for bringing this to our attention Roger. I would add that
GRASS can be used to clean topologically broken shapefiles. Or, it can
be used to vectorize raster data such that the resulting vector is
topologically correct.

Cheers,

Dylan




>
>> I am so happy!
>
> Sorry, no comment! I have replied off-list too, but unfortunately bad
> shapefiles really do exist, and they cause lots of problems.
>
> Roger
>
>>
>> Gerard
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>
> --
> Roger Bivand
> Economic Geography Section, Department of Economics, Norwegian School of
> Economics and Business Administration, Helleveien 30, N-5045 Bergen,
> Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
> e-mail: Roger.Bivand at nhh.no
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>



More information about the R-sig-Geo mailing list