[R-sig-Geo] Subsetting RasterBrick very slow

Rick Reeves reeves at nceas.ucsb.edu
Fri Jul 1 18:20:28 CEST 2011


Hello Christian:
Yes indeed, my experience is that Raster package operations on very 
large datasets take a long time,
on either Linux or Windows desktops.

As an alternative to the Raster package, I have had good experience 
using the GDAL raster image utilities,
embedded within shell or Python scripts to perform first-order raster 
subsampling and subimage generation.
Check out the gdalinfo, gdal_translate, and gdalwarp utilities. I have 
been using these to work with 100 gigabyte-class
raster DEM images, with great success and relatively low execution times 
(considering the size of the data sets).

HTH,
Rick Reeves
NCEAS



On 7/1/2011 7:02 AM, Christian Kamenik wrote:
> Dear all,
>
> The package 'raster' provides excellent tools to work on geographic 
> data. Subsetting a RasterBrick object, however, takes ages.
>
> I have the following RasterBrick object:
>
>     Temp.CRU.cropped
>
>     class:                  RasterBrick
>     filename:
>     dimensions:        43, 99, 1200  (nrow, ncol, nlayers)
>     ncell:                  4257
>     projection:         +proj=longlat +ellps=WGS84
>     min values:        -14.8 -22.1 -15.0  -8.6  -3.8   4.7   8.7   
> 6.0   1.6  -2.4 ...
>     max values:       2.1 -3.9 -2.1  1.8  6.6 15.7 18.0 14.8 10.7  7.7 
> ...
>     extent:               15, 31.5, 64, 71.16667  (xmin, xmax, ymin, 
> ymax)
>     resolution:         0.1666667, 0.1666667  (x, y)
>
> These are temperature readings from northern Fennoscandia, and each 
> layer represents a month, thus each grid cell has monthly readings 
> over the last 100 years.
>
> Now I want to extract the readings for only July:
>
>     Months <- rep(month.abb,nlayers(Temp.CRU.cropped)/12)
>     Temp.CRU.J <- subset(Temp.CRU.cropped,which(Months=='Jul'))
>
> But as I said, this takes ages. It is much faster to do the subsetting 
> on an array:
>
>     CRU.values <- getValues(Temp.CRU.cropped)
>     Temp.CRU.J <- CRU.values[,which(Months=='Jul')]
>     Temp.CRU.J <- 
> array(Temp.CRU.J,dim=c(nrow(Temp.CRU.cropped),ncol(Temp.CRU.cropped),nlayers(Temp.CRU.cropped)/12))
>     Temp.CRU.J <- brick(Temp.CRU.J)
>     Temp.CRU.J <- setExtent(Temp.CRU.J,extent(Temp.CRU.cropped))
>     projection(Temp.CRU.J) <- projection(Temp.CRU.cropped)
>
> It is strange that several lines of code run much faster than the 
> 'subset' command, and I was wondering why 'subset' takes at least a 
> 1000 times longer.
>
> I would be interested in your opinion.
>
> Many thanks, Christian
>



More information about the R-sig-Geo mailing list