[R-sig-Geo] Parallel processing with extract()/randomForest() in VM

ASANTOS @|ex@ndre@@nto@br @end|ng |rom y@hoo@com@br
Tue May 28 03:47:44 CEST 2019


Dear R-Sig-Geo Members,

I created a virtual machine (VM) in Google Cloud running Ubuntu 18.04, 
with 8 CPUs, 30 GB of RAM and R 3.6.0, but my attempts to speed up my 
spatial analysis have not been very successful. When I use the snow and 
doMC packages with all 8 CPUs, the operation uses only 12.54% of the 
machine's capacity. My goal is to run extract() from raster and a random 
forest with randomForest(). The gain is only 18 seconds, which I think 
is not very good, so my question is: is there any way to improve on 
that? In my test, I did the following:

# Get the number of processors from the Ubuntu terminal
foresteyebrazil@superforettech1:~$ cat /proc/cpuinfo | grep process | wc -l
8
#Packages
library(raster)
library(snow)
library(doMC)
library(randomForest)
registerDoMC()
# Get a raster from WorldClim
r<-getData('worldclim', var='alt', res=5)
# 1) Use extract()/randomForest() in the virtual machine ----------------------------
start_time<-Sys.time()
# SpatialPolygons
cds1<-rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2<-rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys<-spPolygons(cds1, cds2)
# Extract
exr<-raster::extract(r, polys)
tr<-ifelse(exr[[2]]<10,c("A"),c("B"))
df<-cbind(tr,exr[[2]], sqrt(exr[[2]]))
df2<-data.frame(as.factor(df[,1]),as.numeric(as.character(df[,2])),as.numeric(as.character(df[,3])))
df2<-df2[complete.cases(df2),]
colnames(df2)<-c("res1","var1","var2")
res<-NULL
for(w in 1:9){
  mod_RF<-randomForest(x=cbind(df2$var1,df2$var2), y=df2$res1, ntree=100, 
                       mtry=2)
  # Store the run index and the mean OOB error rate (%)
  res=rbind(res,cbind(w,mean(mod_RF$err.rate[,1])*100))
}
end_time<-Sys.time()
end_time-start_time
#
#Time difference of 38.72528 secs
# 2) Use extract() with the snow and doMC packages in the virtual machine ----------------------------
start_time<-Sys.time()
# SpatialPolygons
cds1<-rbind(c(-180,-20), c(-160,5), c(-60, 0), c(-160,-60), c(-180,-20))
cds2<-rbind(c(80,0), c(100,60), c(120,0), c(120,-55), c(80,0))
polys<-spPolygons(cds1, cds2)
# Extract
beginCluster(n=8)
exr<-raster::extract(r, polys)
tr<-ifelse(exr[[2]]<10,c("A"),c("B"))
df<-cbind(tr,exr[[2]], sqrt(exr[[2]]))
df2<-data.frame(as.factor(df[,1]),as.numeric(as.character(df[,2])),as.numeric(as.character(df[,3])))
df2<-df2[complete.cases(df2),]
colnames(df2)<-c("res1","var1","var2")
endCluster()
res<-NULL
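mod_RF2<-foreach(i=1:9) %dopar% {
  randomForest(x=cbind(df2$var1,df2$var2), y=df2$res1, ntree=100, mtry=2)
}
# mod_RF2 is a list of 9 models; collect the mean OOB error rate (%) of each
res=rbind(res,cbind(sapply(mod_RF2,function(m) mean(m$err.rate[,1])*100)))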
end_time<-Sys.time()
end_time-start_time
#
#Time difference of 20.57027 secs
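
A reported load of about 12.5% on an 8-CPU machine corresponds to roughly one busy core (100/8), so one thing worth checking is how many workers foreach actually has. registerDoMC() is called above without a cores argument, which leaves the worker count to the package default. The following is only a minimal sketch, assuming the foreach and doMC packages are installed on a Unix-like system; the variable n_workers and the 16-iteration loop are illustrative assumptions, not part of the test above.

```r
library(foreach)
library(doMC)

# Assumption: request 8 workers, matching the 8 CPUs reported by the VM
n_workers <- 8
registerDoMC(cores = n_workers)

# Number of workers that %dopar% will use
getDoParWorkers()

# Each iteration returns the process ID of the worker that ran it,
# so the number of distinct IDs shows how many processes took part
pids <- foreach(i = 1:16, .combine = c) %dopar% Sys.getpid()
length(unique(pids))
```

If the number of distinct process IDs is smaller than expected, the loop is not being spread over all the registered workers.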

Thanks in advance,

-- 
======================================================================
Alexandre dos Santos
Proteção Florestal
IFMT - Instituto Federal de Educação, Ciência e Tecnologia de Mato Grosso
Campus Cáceres
Caixa Postal 244
Avenida dos Ramires, s/n
Bairro: Distrito Industrial
Cáceres - MT                      CEP: 78.200-000
Fone: (+55) 65 99686-6970 (VIVO) (+55) 65 3221-2674 (FIXO)

         alexandre.santos using cas.ifmt.edu.br  
Lattes:http://lattes.cnpq.br/1360403201088680
OrcID: orcid.org/0000-0001-8232-6722
Researchgate:www.researchgate.net/profile/Alexandre_Santos10                        
LinkedIn: br.linkedin.com/in/alexandre-dos-santos-87961635
Mendeley:www.mendeley.com/profiles/alexandre-dos-santos6/
======================================================================





