[R-sig-Geo] parallel raster processing with calc and mc2d monte carlo simulation

Robert J. Hijmans r.hijmans at gmail.com
Tue Mar 17 16:14:41 CET 2015


Sean,

fun.CROP_AGWBC  refers to objects that are not defined inside the
function ("lm.final" and "lambda_DV"). I assume that this is
intentional and that these represent constants; and that they are
available in your global environment. If so, you need to export these
objects to the cluster nodes. See the 'export' argument in clusterR.
You also need to load necessary packages before calling beginCluster

Robert

On Mon, Mar 16, 2015 at 1:46 PM, spkearney <sean.kearney at alumni.ubc.ca> wrote:
> Hello all, and thanks in advance for any and all help you can give on this:
>
> I have set up a function to extract the 2.5%, 50% and 97.5% percentiles from
> a monte carlo simulation on three rasters that is to be called up using
> calc() in the raster package and it works great on a test-sized stack/brick,
> thanks to suggestions at this post here:
> http://grokbase.com/t/r/r-sig-geo/123cb3daaq/apply-monte-carlo-simulation-for-each-cell-in-a-matrix-originally-raster
>
> My problem, is that I want to run this function on a much larger Raster
> Brick that, as written, takes hours to process.  I need to do this multiple
> times, so I am trying to speed up the processing using clusterR (or another
> option such as rasterEngine with multi-core processing).  However, I can't
> get it to work!   Here is the code that works on the test raster brick:
>
> brick <- brick(BC_BA, BC_BA_SE, SlopePer)  ## Stack three rasters into one
> Raster Brick
> testbrick <- crop(brick, extent(299700, 300100, 1553550, 1553650)) ## Crop
> brick to manageable size
>
> ndunc(101)
> fun.CROP_AGWBC <- function(x) {
>   dBC_BA <- mcdata(x[[1]], type="0")
>   dBC_BA_SE <- mcdata(x[[2]], type = "0")
>   SlopePer <-x[[3]]
>   stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE,
>                  mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE)
>   BC_AGWBC <- lm.final$coefficients[1] +
>     lm.final$coefficients[2]*stBA +
>     lm.final$coefficients[3]*SlopePer
>   AGWBC <- (lambda_DV * BC_AGWBC + 1)^(1/lambda_DV)-1
>   quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE)
> }
>
> CROP_AGWBC <- calc(teststack, fun.CROP_AGWBC) ##Run the calc function
> CROP_AGWBC ##Check the result
> plot(CROP_AGWBC) ##Plot the three-raster brick result
>
> ##Extract the individual raster layers
> CROP_AGWBC_PRED <- CROP_AGWBC[[2]]
> CROP_AGWBC_LWR <- CROP_AGWBC[[1]]
> CROP_AGWBC_UPR <- CROP_AGWBC[[3]]
>
> As I mentioned, the code works great on a small sample.  I tried to speed it
> up using clusterR as follows, first testing it on the 'testbrick' Raster
> Brick with hopes to use it on the whole Raster Brick:
>
> beginCluster(8)
> clusterR(x = testbrick, fun = fun.CROP_AGWBC)
>
> and I get the following error:
> [1] "data should be numeric or logical"
> attr(,"class")
> [1] "snow-try-error" "try-error"
> Error in clusterR(x = testbrick, fun = fun.CROP_AGWBC) : cluster error
>
> It is interesting because, if I try to run it again, I get this error
> instead:
> Error in as.vector((x[, 1] - 1) * ncol(object) + x[, 2]) :
>   error in evaluating the argument 'x' in selecting a method for function
> 'as.vector': Error in x[, 2] : subscript out of bounds
>
> I have tried this many different ways, including along the lines of:
> f <- function (x) calc(x, fun.CROP_AGWBC)
> y <- clusterR(testbrick, f)
>
> which gives me the same error (more or less) of:
> Error in checkForRemoteErrors(lapply(cl, recvResult)) :
>   2 nodes produced errors; first error: data should be numeric or logical
>
> And I have tried using the rasterEngine() function (first without parallel
> processing) by changing up the code in two ways, the first being:
> ndunc(101)
> fun.CROP_AGWBC <- function(x) {
>   dBC_BA <- mcdata(x[[1]], type="0")
>   dBC_BA_SE <- mcdata(x[[2]], type = "0")
>   SlopePer <-x[[3]]
>   stBA <- mcstoc(rnorm, type = "U", rtrunc = TRUE,
>                  mean = dBC_BA, sd = dBC_BA_SE, linf = 0, lhs = FALSE)
>   BC_AGWBC <- lm.final$coefficients[1] +
>     lm.final$coefficients[2]*stBA +
>     lm.final$coefficients[3]*SlopePer
>   AGWBC <- (lambda_DV * BC_AGWBC + 1)^(1/lambda_DV)-1
>   output <- quantile(AGWBC[], c(0.025, 0.5, 0.975), na.rm=TRUE)
>   output_array <- array(output,dim=c(dim(x)[1],dim(x)[2],3))
>   return(output_array)
> }
> re <- rasterEngine(x = testbrick, fun = fun.CROP_AGWBC)
>
> which runs but gives me a 3-layer Raster Brick all with NA's or Inf.  The
> second thing I tried used the same fun.CROP_AGWBC function as above, but
> with the following rasterEngine code to call up the calc formula:
> f <- function(x) {reout <- calc(x, fun.CROP_AGWBC)
>                   reout_array <- array(getValues(reout),
> dim=c(dim(x)[1],dim(x)[2],3))
>                   return(reout_array)
> }
> re <- rasterEngine(x = testbrick, fun = f, chunk_format = "raster")
>
> which gives me the following error, even though I thought I converted the
> output to an array:
> chunk processing units require array vector outputs.  Please check your
> function.
> Error in focal_hpc_test(x, fun, window_center, window_dims, args,
> layer_names,  :
>
> So, my questions are as follows:
> *1) Does anyone know why the clusterR does not work for the calc() function
> in my first attempt?  I imagine it has something to do with the conversion
> of rasters to mcnodes in the function, but can't figure it out!  Any
> suggestions?
>
> 2) Any thoughts on why I can't get this to work with the rasterEngine()
> function?  I am converting the outputs to arrays with the same dimensions as
> the input file, but still no luck.*
>
> Again, any help is much appreciated.  Any suggestions for improving this
> question are welcome and I'll do my best to update it - this is my first
> post!
>
> Kind Regards,
> sean
>
>
>
>
>
> --
> View this message in context: http://r-sig-geo.2731867.n2.nabble.com/parallel-raster-processing-with-calc-and-mc2d-monte-carlo-simulation-tp7587901.html
> Sent from the R-sig-geo mailing list archive at Nabble.com.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo



More information about the R-sig-Geo mailing list