[R-sig-Geo] Raster in parallel computing?

Matteo Mattiuzzi matteo.mattiuzzi at boku.ac.at
Wed Jan 8 15:01:00 CET 2014


Camilo,
I think your example is missing important information, especially the context of this processing step, which matters a lot for choosing a solution.
Based on your example, I would suggest trying something like this:


library(raster)
#creates 3 test rasters
a <- raster(nrow=3, ncol=3)
a[] <- 1:9


b <- raster(nrow=3, ncol=3)
b[] <- 10:18


c <- raster(nrow=3, ncol=3)
c[] <- 19:27


# collects the rasters in a list, one element per parallel task
# d<-brick(a,b,c) # removed, as it is not meaningful for this test run
infiles <- list(a,b,c)


#creates a raster at a different resolution
s <- raster(nrow=10, ncol=10)
#####


library(parallel) # provides clusterExport() and parLapply()
beginCluster() # see ?beginCluster
cl <- getCluster()
clusterExport(cl, "s") # export the target raster 's' to the workers, so it is available to the resample function
res <- parLapply(cl, infiles, fun=function(x){resample(x,s)}) # you will probably need the 'filename' argument of ?resample; it can be added here (in fun)
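The export step is needed because each worker is a separate R process that does not see the objects in your session. A minimal sketch of the same pattern using only the base 'parallel' package; the number 's' and the input list are hypothetical stand-ins for the target raster and the rasters, and the two-worker cluster is an assumption:

```r
library(parallel)

# Hypothetical stand-in for the target raster 's': a plain number
s <- 10

cl <- makeCluster(2)  # start two worker processes

# Copy 's' from the calling environment to every worker
clusterExport(cl, "s", envir = environment())

# Each list element is processed on a worker, which uses its own copy of 's'
res <- parLapply(cl, list(1, 2, 3), function(x) x * s)

stopCluster(cl)  # release the workers

print(unlist(res))
# [1] 10 20 30
```

With rasters the structure stays the same; only the function body changes to the resample() call.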


I'm not sure what happens with 1 million files open. Using the 'filename' argument and overwriting the result of xx <- resample(x,s) might avoid this. Here is an idea that should work if the files are on disk (filename(a) != "") rather than created in memory as in your example:
 
#not tested!
parfun <- function(x){
inname <- filename(x)
# insert "_resampled" before the file extension, e.g. a.tif -> a_resampled.tif
outname <- paste0(tools::file_path_sans_ext(inname), "_resampled", extension(inname))
xx <- resample(x, s, filename=outname, method='bilinear')
return(NULL)}


res <- parLapply(cl, infiles, fun=parfun) # results are written to disk, not returned to your R session (res is a list of NULLs)
returnCluster()
endCluster() # shut down the workers when done
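The output-name rule intended in parfun can be checked on its own, without any rasters. A minimal sketch using only base R; the helper 'resampled_name' and the example paths are hypothetical, made up for illustration:

```r
library(tools)  # file_path_sans_ext() and file_ext()

# Hypothetical helper: insert "_resampled" before the file extension
resampled_name <- function(inname) {
  paste0(file_path_sans_ext(inname), "_resampled.", file_ext(inname))
}

# Hypothetical input paths
innames <- c("data/a.tif", "data/b.grd")
outnames <- resampled_name(innames)

print(outnames)
# [1] "data/a_resampled.tif" "data/b_resampled.grd"
```

Running this on a handful of real file names before starting the full job is a cheap way to make sure no input file gets overwritten.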



Matteo


>>> Camilo Mora <cmora at dal.ca> 01/08/14 8:40 AM >>>
Hi everyone,

I am using the package "raster" to interpolate a large number of
rasters (~1 million) of different resolutions to a single 1-degree
resolution grid, and I wonder whether it is possible to do this in
parallel.

My code (example below) works like a charm, but it will take 30 days to
process all the rasters. Sadly, the process uses only one core of my
computer. Is there a way to run this code (example below) in
parallel?

Thanks,

Camilo

####TEST CODE######
library (raster)

#creates 3 test rasters
a <- raster(nrow=3, ncol=3)
a[] <- 1:9

b <- raster(nrow=3, ncol=3)
b[] <- 10:18

c <- raster(nrow=3, ncol=3)
c[] <- 19:27

#concatenates the rasters
d<-brick(a,b,c)

#creates a raster at a different resolution
s <- raster(nrow=10, ncol=10)

#interpolates data from the brick to the new resolution
s <- resample(d, s, method='bilinear')

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


