[R-sig-Geo] raster[] slow on large rasters

Kenny Bell kmb56 at berkeley.edu
Sun Oct 2 23:25:15 CEST 2016


I am trying to sample points from a large RasterLayer (~100GB if read into
memory).

raster::sampleRandom relies on raster:::.readCellsGDAL, which seems to
loop through the rows that contain the requested cells, read each
entire row using rgdal::getRasterData, and subset those rows in R.
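
As far as I can tell, the access pattern is roughly this (my paraphrase
of the internals, not the actual code; gd stands for an open GDAL
handle and cells for the requested cell numbers):

for (r in sort(unique(rowFromCell(cdl, cells)))) {
  # offset is 0-based and ordered (row, col); this reads one full row
  row_vals <- rgdal::getRasterData(gd, offset = c(r - 1, 0),
                                   region.dim = c(1, ncol(cdl)))
  # ...the few sampled cells are then picked out of row_vals in R
}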

Sampling 100000 points from this raster means reading only a few cells
per row, so this isn't efficient.

Indexing with my own random cell numbers via `[` also goes through
raster:::.readCellsGDAL, so it hits the same bottleneck.
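
For example (with cdl as defined in the reproducible code below):

set.seed(1)
cells <- sample(raster::ncell(cdl), 100000)
vals <- cdl[cells] # same code path, same row-by-row reads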

Does anyone have a suggestion for a better practice?

The raster is public so this code should be reproducible:

download:
ftp://ftp.nass.usda.gov/download/res/2015_30m_cdls.zip

library(raster)
cdl <- raster("2015_30m_cdls/2015_30m_cdls.img")
raster::sampleRandom(cdl, size = 100000) # slow
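
For comparison, this is the kind of per-cell read I had in mind, going
through rgdal directly. Only a sketch; I haven't timed it at scale, and
one GDAL call per cell carries its own per-call overhead:

library(raster)
library(rgdal)

cdl <- raster("2015_30m_cdls/2015_30m_cdls.img")
gd  <- GDAL.open("2015_30m_cdls/2015_30m_cdls.img")

set.seed(1)
cells <- sample(ncell(cdl), 1000) # small n just for illustration
rows  <- rowFromCell(cdl, cells)  # 1-based
cols  <- colFromCell(cdl, cells)

# one 1x1 window per cell; offset is 0-based and ordered (row, col)
vals <- mapply(function(r, cl) {
  getRasterData(gd, offset = c(r - 1, cl - 1), region.dim = c(1, 1))
}, rows, cols)

GDAL.close(gd)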

Cheers,
Kenny



