[R-sig-Geo] stratified random sampling
Roger Bivand
Roger.Bivand at nhh.no
Wed Feb 25 09:53:36 CET 2009
On Tue, 24 Feb 2009, Heuvelink, Gerard wrote:
> Dear list,
>
> The stratified random sampling problem that I submitted a few days ago
> has already been solved, with the help of several of you, notably Edzer
> Pebesma. Edzer came up with the following solution:
WARNING!! This is only conditionally correct. See below for analysis.
>
> library(sp)
> library(rgdal)
> nc1 <- readShapePoly(system.file("shapes/sids.shp",package="maptools")[1],
+ proj4string=CRS("+proj=longlat +datum=NAD27"))
> pts = do.call(rbind, sapply(slot(nc1, "polygons"), spsample, n=1,
+ type="random"))
> plot(nc1)
> points(pts, col='blue', pch=19, cex=1)
>
> As it happened, the do.call statement did not work in my case (Edzer and
> Roger may look into why it does not work with all shapes)
It was because the shapefile was no good, with both self-intersecting and
overlapping polygons. This left some values returned by spsample in the
sapply call as NULL (correctly), and no rbind() method exists for
SpatialPoints and NULL objects.
> and had to be replaced by:
>
> for (i in 1:length(slot(nc1, "polygons"))) {
> pt = spsample(nc1[i,], n=1, type="random")
> if (i == 1)
> pts = pt
> else
> pts = rbind(pts, pt)
> }
>
This gets a result - there are other ways of doing it too, but it is not
what it seems. Because the polygons are self-intersecting and overlapping,
the point-in-polygon algorithm is not choosing points correctly (spsample
for a ring generates more points than needed in the bounding box of the
polygon, and chooses the number needed from those that fall within the
polygon).
The input shapefile is a vectorised map of soil types from raster data,
but unfortunately the software used to generate it is unknown, so we can't
warn people off it. Assuming that only one soil type occurs in each raster
cell, we can reproduce the case with meuse.grid:
library(sp)
data(meuse.grid)
coordinates(meuse.grid) <- c("x", "y")
gridded(meuse.grid) <- TRUE
meuse.grid$soil <- factor(meuse.grid$soil)
spplot(meuse.grid, "soil")
meuseSP <- as(meuse.grid, "SpatialPolygons")
ID <- as.character(meuse.grid$soil)
library(maptools)
meuseSP1 <- unionSpatialPolygons(meuseSP, ID)
plot(meuseSP1, col=1:3)
pts = do.call(rbind, sapply(slot(meuseSP1, "polygons"), spsample, n=1,
type="random"))
points(pts, pch=3, col="white")
Doing the raster to vector conversion in R ought to resolve the underlying
problem of the soil polygons being generated from the raster values in an
inappropriate way.
> I am so happy!
Sorry, no comment! I have replied off-list too, but unfortunately bad
shapefiles really do exist, and they cause lots of problems.
Roger
>
> Gerard
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
More information about the R-sig-Geo
mailing list