[R-sig-Geo] Speeding up foreach loop for local neighborhood operation on a RasterStack

stumpf andré andre.stumpf at univ-brest.fr
Tue Mar 31 20:08:48 CEST 2015


Dear list members,

I'm trying to implement an analysis that applies PCA on local
neighborhoods (e.g. 5x5 pixel)
over a large raster stack. To speed up processing I'm using foreach but
run into some performance
issues. Please find below an example that gives the general idea. (Sorry
for the long example...questions at the end)

library(raster)
library(doParallel)

#create a raster
rast <- raster(ncol=100, nrow=100)
rast[] <- runif(100*100)
crs(rast) <- "+proj=utm +zone=32 +datum=WGS84 +units=m +no_defs
+ellps=WGS84 +towgs84=0,0,0"
bb <- extent(310883.5, 310883.5+50, 4918798, 4918798+50)
extent(rast) <- bb

#create two stacks
stack.rast1 <- rast
while (nlayers(stack.rast1)<=20) {
  tmp.rast <- rast-2*runif(1)
  stack.rast1 <- addLayer(stack.rast1,tmp.rast)
}

stack.rast2 <- rast
while (nlayers(stack.rast2)<=20) {
  tmp.rast <- rast-2*runif(1)
  stack.rast2 <- addLayer(stack.rast2,tmp.rast)
}

# only pixel exceeding a certain threshold in the first layer should be
analyzed
thresh <- 0.5
thresh.rast <- rast > thresh
freq(thresh.rast)

# get indices of 1's
cells.ind <- Which(rast > thresh, cells=TRUE)
pixel.ind <- rowColFromCell(thresh.rast, cells.ind)
nrows <- 5
ncols <- 5

# wrap the borders with zeros
stack.rast1 <- extend(stack.rast1, c(nrows,ncols), value=0)
stack.rast2 <- extend(stack.rast2, c(nrows,ncols), value=0)
pixel.ind <- pixel.ind + nrows

# initialize cluster and loop over all indexed pixel
cl <- makeCluster(8)
registerDoParallel(cl)
out.test <- NULL

out.test <-  foreach (i=1:nrow(pixel.ind), .packages='raster',
.combine='c') %dopar% {
  j <- pixel.ind[i,1]
  k <- pixel.ind[i,2]
  tmp.x <- as.vector(getValuesBlock(stack.rast1, row=j, nrows=5, col=k,
ncols=5, lyrs=1:20))
  tmp.y <- as.vector(getValuesBlock(stack.rast2, row=j, nrows=5,
col=k,ncols=5, lyrs=1:20))
  r <- prcomp(cbind(tmp.x, tmp.y))
  (r$sdev[1]^2 / sum(r$sdev^2))
}

stopCluster(cl)

# write
writeRaster(stack.rast1,file="~/RasterStack1.tif",
format="GTiff",overwrite=TRUE)
writeRaster(stack.rast2,file="~/RasterStack2.tif",
format="GTiff",overwrite=TRUE)

# read back and run the loop
stack.rast1 <- stack("~/RasterStack1.tif")
stack.rast2 <- stack("~/RasterStack2.tif")
# the loop now takes forever

# so I ran a simple operation to read the raster back in memory
stack.rast1 <- stack.rast1*1
stack.rast2 <- stack.rast2*1
# now the loops executes as fast as when the stack was original created

Questions:
1. Getting the values out of the raster stack takes most of the time:
Can anyone think of a faster way to access the local values in the stack?

system.time( r <- prcomp(cbind(tmp.x, tmp.y)))
user  system elapsed
0.000   0.000   0.001

system.time(tmp.x <- as.vector(getValuesBlock(stack.rast1,
row=j,nrows=5, col=k, ncols=5, lyrs=1:20)))
user  system elapsed
0.003   0.000   0.002

2. Running an algebraic operation to get the raster back into memory is
maybe a bit clumsy. Is that the right way to do it?

Kind regards, and many thanks in advance for any hints,
André

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <https://stat.ethz.ch/pipermail/r-sig-geo/attachments/20150331/151d689d/attachment.bin>


More information about the R-sig-Geo mailing list