[R-sig-Geo] stackApply and clusterR
joshgray
joshgray at bu.edu
Wed Oct 2 13:28:48 CEST 2013
I'd like to use /stackApply/ with /clusterR/ to parallelize /RasterStack/
operations. For example, I want to run a time-series analysis function over the
"z-dimension" of a /RasterStack/ composed of many large images. I'd
previously accomplished this with the /parallel/ and /rgdal/ packages along
with the /parApply/ function. However, the stacks have gotten a lot bigger
and I can no longer read the entire stack into memory to perform these
operations. So, I'd like to take a /raster/ approach.
The /clusterR/ documentation says it supports most functions that take a /Raster*/ object
as the first argument. Since /stackApply/ conforms to this requirement, I
thought it'd be easy. However, I can't seem to get this to work. My
computing environment is a Linux cluster running Sun Grid Engine.
Here's a toy example that tries to count the number of non-NA values in the
z-dimension of a /RasterStack/. This is just for demo purposes; the actual
operations are more complex.
require(raster)
beginCluster(n=8)
my.f1 <- function(x) sum(!is.na(x)) # count non-NA values; length() would count every value
set.seed(42)
x1 <- runif(100)
x2 <- x1
x3 <- x1
x1[sample(1:100, 30)] <- NA
x2[sample(1:100, 30)] <- NA
x3[sample(1:100, 30)] <- NA
r1 <- raster(matrix(x1, nrow=10, ncol=10))
r2 <- raster(matrix(x2, nrow=10, ncol=10))
r3 <- raster(matrix(x3, nrow=10, ncol=10))
s <- stack(r1, r2, r3)
s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
s.out <- clusterR(s, fun=my.f1) # fails
endCluster()
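For reference, the per-position count this toy example is after can be sketched in base R, without /raster/ or a cluster. This is only a sketch under assumptions: the name count.valid is made up for illustration, and the positions follow the order of the input vectors rather than raster cell numbering.

```r
# Rebuild the three toy layers the same way as in the example above.
set.seed(42)
x1 <- runif(100)
x2 <- x1
x3 <- x1
x1[sample(1:100, 30)] <- NA
x2[sample(1:100, 30)] <- NA
x3[sample(1:100, 30)] <- NA

# Hypothetical helper (not part of raster): number of non-NA values in a vector.
count.valid <- function(x) sum(!is.na(x))

# One row per position, one column per layer; applying over rows walks the
# "z-dimension" at each position.
layers <- cbind(x1, x2, x3)
counts <- apply(layers, 1, count.valid)
```

Each element of counts lies between 0 and 3, which is the kind of per-cell value the parallel version should return.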
The first attempt at creating /s.out/ fails with this message:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
6 nodes produced errors; first error: unused argument(s) (na.rm = TRUE)
And then running it again:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in cellFromRowCol(out, tr$row[j], 1) : subscript out of bounds
And once more:
> s.out <- clusterR(s, stackApply, args=list(indices=c(1:3), fun=my.f1)) # fails
Error in checkForRemoteErrors(lapply(cl, recvResult)) :
one node produced an error: unused argument(s) (na.rm = TRUE)
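The "unused argument(s) (na.rm = TRUE)" message suggests that /stackApply/ hands an na.rm argument to the supplied function, which my.f1 does not accept. A minimal base R sketch of that mismatch follows. The names call.with.na.rm, f.strict and f.tolerant are hypothetical, and call.with.na.rm only imitates a caller that passes na.rm; it is not how /stackApply/ is implemented.

```r
# Hypothetical stand-in for a caller that always passes na.rm to fun.
call.with.na.rm <- function(v, fun) fun(v, na.rm = TRUE)

# Accepts only x, like my.f1, so the extra na.rm argument is an error.
f.strict <- function(x) sum(!is.na(x))

# Accepts na.rm (and anything else via ...) and simply ignores it.
f.tolerant <- function(x, na.rm = TRUE, ...) sum(!is.na(x))

v <- c(0.2, NA, 0.7)

# The strict version fails; capture the error message instead of stopping.
strict.result <- tryCatch(call.with.na.rm(v, f.strict),
                          error = function(e) conditionMessage(e))

# The tolerant version returns the number of non-NA values.
tolerant.result <- call.with.na.rm(v, f.tolerant)
```

Whether a function shaped like f.tolerant also clears the "subscript out of bounds" error under /clusterR/ is not something this sketch tests.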
Any ideas?