[R-sig-Geo] Advice on raster workflow: subs -> resample

Fri Oct 29 16:21:21 CEST 2010

Hello,

I am writing to ask for advice as to whether I am following the most
efficient/fastest workflow for a task using the raster package. This
involves 3 datasets:

1. modtest: A ~927 m resolution satellite-derived image:
nrow        : 141
ncol        : 189
ncell       : 26649
min value   : ?
max value   : ?
projection  : +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24
+x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
+towgs84=0,0,0
xmin        : 415851.2
xmax        : 590983.4
ymin        : -2670093
ymax        : -2539439
xres        : 926.6254
yres        : 926.6254

2. lt.q.test: a ~92 m resolution grid representing identifiers for
unique climate-soil combinations:
nrow        : 1403
ncol        : 1878
ncell       : 2634834
min value   : ?
max value   : ?
projection  : +proj=aea +lat_1=-18 +lat_2=-32 +lat_0=0 +lon_0=24
+x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs
+towgs84=0,0,0
xmin        : 415926.7
xmax        : 590091.6
ymin        : -2670093
ymax        : -2539979
xres        : 92.73961
yres        : 92.73961

3. sum.stats: A dataframe within a list with several variables
relating to outputs from a crop simulations, and one variable
containing values corresponding to the identifiers in yld.test.

My task is to
1. assign crop yield values from sum.stats to lt.q.test
2. resample the result of 1 to the resolution of modtest, so that the
images are lined up, and so that the yield values in yld.test are
averaged/aggregated to modtest's resolution.

The solution I have come up with so far is:

library(raster)

lt.q.test <- raster(paste(path.in, "lt_quin_test1.img", sep = ""))

# Assign values from sum.stats (in y.mn column) to yld.test,
outputting values as integers.
setwd(path.out)
subs(lt.q.test, sum.stats[[1]], by = "VALUE", which = "y.mn",
subsWithNA = TRUE,
                 filename = "yld.rst4.img", format = 'HFA', datatype = "INT2S")

# Resample yld.test to resolution of modtest
yld.test <- raster("yld.rst4.img")
modtest <- raster("MOD_SA_NDVI_amp_albers_test.img")
resample(yld.test, modtest, method = "bilinear", filename =
"yld.resamp.test.img", datatype = "INT2S",
         format = "HFA")  # This seems to work

The above is performed on fairly small test datasets, and I must now
apply this approach multiple times to the full size versions of the
input datasets, where the dimensions of lt.q.test are:

nrow        : 15170
ncol        : 17364
ncell       : 263411880

and modtest's dimensions are:

nrow        : 1651
ncol        : 1937
ncell       : 3197987

I was thus wondering if there is a more efficient approach that I
should adopt, or if there are any other
concerns of which I should be aware?

Thanks in advance for any advice.

Cheers, Lyndon