[R-sig-Geo] stackApply and clusterR

joshgray joshgray at bu.edu
Wed Oct 2 13:28:48 CEST 2013


I'd like to use /stackApply/ with /clusterR/ to parallelize /RasterStack/
operations, for example to run a time-series analysis function over the
"z-dimension" of a RasterStack composed of many large images. I previously
did this with the /parallel/ and /rgdal/ packages and the /parApply/
function. However, the stacks have grown a lot, and I can no longer read an
entire stack into memory to perform these operations, so I'd like to take a
/raster/ approach.

/clusterR/ says that it supports most functions that take a /Raster*/ object
as the first argument. Since /stackApply/ conforms to this requirement, I
thought it'd be easy. However, I can't seem to get this to work. My
computing environment is a Linux cluster running Sun Grid Engine.

Here's a toy example that tries to count the number of non-NA values in the
z-dimension of a /RasterStack/. This is just for demo purposes; the actual
operations are more complex.

require(raster)
beginCluster(n=8)

my.f1 <- function(x) sum(!is.na(x)) # sum() counts the non-NA values; length() would count every value

set.seed(42)
x1 <- runif(100)
x2 <- x1
x3 <- x1
x1[sample(1:100, 30)] <- NA
x2[sample(1:100, 30)] <- NA
x3[sample(1:100, 30)] <- NA
r1 <- raster(matrix(x1, nrow=10, ncol=10))
r2 <- raster(matrix(x2, nrow=10, ncol=10))
r3 <- raster(matrix(x3, nrow=10, ncol=10))
s <- stack(r1, r2, r3)

s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
s.out <- clusterR(s, fun=my.f1) # fails

endCluster()


The first attempt at creating /s.out/ fails with this message:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  6 nodes produced errors; first error: unused argument(s) (na.rm = TRUE)

And then running it again:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in cellFromRowCol(out, tr$row[j], 1) : subscript out of bounds

And once more:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in checkForRemoteErrors(lapply(cl, recvResult)) : 
  one node produced an error: unused argument(s) (na.rm = TRUE)
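
The "unused argument(s) (na.rm = TRUE)" message suggests that /stackApply/
calls the supplied function with an /na.rm/ argument, which /my.f1/ does not
accept. Below is a minimal sketch of a variant that might avoid this. It
assumes that this is the cause, that a single shared index is what is wanted
to collapse all layers into one count, and that /calc/ can be passed through
/clusterR/ in the same way. The name count.valid is made up for illustration,
and none of this has been verified on the cluster described above.

```r
library(raster)

# Hypothetical replacement for my.f1: '...' absorbs na.rm (and any other
# extra arguments) so that callers passing na.rm=TRUE do not trigger an
# "unused argument" error.
count.valid <- function(x, ...) sum(!is.na(x))

# Same kind of toy data as above: three 10x10 layers, 30 NA cells each.
set.seed(42)
layers <- lapply(1:3, function(i) {
  v <- runif(100)
  v[sample(1:100, 30)] <- NA
  raster(matrix(v, nrow=10, ncol=10))
})
s <- stack(layers)

# Serial reference: all three layers share index 1, giving one output layer
# that holds the per-cell count of non-NA values.
n.serial <- stackApply(s, indices=c(1, 1, 1), fun=count.valid)

# Parallel attempt: calc applies the function across the layers of each cell.
# Assumption: two local workers are available.
beginCluster(n=2)
n.par <- clusterR(s, calc, args=list(fun=count.valid))
endCluster()
```

If both calls run, comparing values(n.serial) with values(n.par) would show
whether the parallel result matches the serial one.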

Any ideas?
