[R-sig-Geo] spatially balanced sampling to reduce geo-political bias

Robert J. Hijmans r.hijmans at gmail.com
Mon May 31 03:48:32 CEST 2010


Dear Claire,

The function below stratifies the region with a raster grid and draws up
to n random points from each occupied cell. You can tune it by changing
the resolution of the raster (res) and/or n (the number of samples to
take from each cell).


library(raster)

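gridsample <- function(xy, res, n=1) {
	# xy: two-column matrix of coordinates (x, y)
	# build a grid covering all points, padded by res on each side
	r = raster(extent(range(xy[,1]), range(xy[,2])) + res)
	res(r) = res
	# grid cell number of each point
	cell = cellFromXY(r, xy)
	uc = unique(cell)
	# add the cell number (column 3) and a random sort key (column 4)
	xy = cbind(xy, cell, runif( nrow(xy)))
	# shuffle the points by ordering on the random key
	xy =  xy[order(xy[,4]), ]
	pts = matrix(nrow=0, ncol=2)
	for (u in uc) {
		# keep the first n (or fewer) shuffled points in this cell
		ss = subset(xy, xy[,3] == u)
		pts = rbind(pts, ss[1:min(n, nrow(ss)), 1:2, drop=FALSE])
	}
	return(pts)
}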

x = rnorm(1000, 10, 5)
y = rnorm(1000, 50, 5)
xy = cbind(x,y)

# change the value of res (and/or n) until nrow(samp)
# approximates your desired number of samples
samp = gridsample(xy, res=5, n=1)
nrow(samp)

plot(xy, cex=0.1)
points(samp, pch='x', col='red')
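
If you would rather not adjust res by hand, the search can be automated by
trying a range of resolutions and keeping the one whose sample size is
closest to the target. The following is only a rough sketch of that idea,
not part of the function above: grid_thin() and tune_res() are
hypothetical helper names, the candidate resolutions are arbitrary, and
the grid is built with base R arithmetic (floor of coordinate / res)
instead of a raster, so the cell boundaries will differ slightly from
those used by gridsample().

```r
# Hypothetical helper (an assumption, not the raster-based function):
# assign each point to a square cell of side `res` and keep up to `n`
# randomly chosen points per cell.
grid_thin <- function(xy, res, n = 1) {
  # integer column / row index of the cell containing each point
  col <- floor((xy[, 1] - min(xy[, 1])) / res)
  row <- floor((xy[, 2] - min(xy[, 2])) / res)
  cell <- paste(col, row)
  # shuffle the row indices, then keep the first n per cell
  idx <- sample(nrow(xy))
  keep <- unlist(lapply(split(idx, cell[idx]), head, n))
  xy[sort(keep), , drop = FALSE]
}

# Hypothetical helper: evaluate each candidate resolution and return the
# one whose sample size is closest to `target`.
tune_res <- function(xy, target, candidates, n = 1) {
  sizes <- sapply(candidates, function(r) nrow(grid_thin(xy, r, n)))
  best <- which.min(abs(sizes - target))
  list(res = candidates[best], size = sizes[best], sizes = sizes)
}

set.seed(1)
xy <- cbind(x = rnorm(1000, 10, 5), y = rnorm(1000, 50, 5))

# candidate resolutions are illustrative; choose a range that suits your data
fit <- tune_res(xy, target = 300, candidates = seq(0.25, 5, by = 0.25))
fit$res    # chosen resolution
fit$size   # number of points sampled at that resolution

samp <- grid_thin(xy, fit$res)
```

With n = 1 the sample size equals the number of occupied cells, so it
depends only on res and not on the random draw. That is why a simple scan
over candidates is enough here; for n > 1 the same scan still works, but
sparsely filled cells contribute fewer than n points.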




On Sat, May 29, 2010 at 2:43 AM, Claire Teeling <CXT755 at bham.ac.uk> wrote:
> Dear all,
>
> I am carrying out some species distribution modelling based on a database of
> species occurrence records of a single tree species, encompassing the entire European
> continent. The records are primarily historical and heavily biased towards
> western, northern Europe. A few of the counts of records by country are shown
> below to illustrate.
>
> CHE     12
> CZE     1
> DEN     6
> DEU     1742
> DNK     12
> ESP     237
> FIN     1
> FRA     6536
> GBR     3294
> GEO     39
> GRC     47
> HUN     2
>
> I am very new to R and I'm trying to find a way to subsample in order to obtain
> a more spatially balanced sample of 300 records, from a total of 16794. I have
> looked at some packages, e.g. sp, spcosa, spsurvey, spdep, have searched the
> manuals and searched for similar examples.
> I have also tried to stratify the data but can't find a stratum which reduces
> the impact of the bias. I also have a field containing inclusion probabilities
> for each record, based on country.
> I just can't seem to work out how best to perform sampling to reduce the effect
> of geopolitical bias.
> Any advice, for an R novice, would be very gratefully received.
>
> Thanks,
>
> Claire
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>


