[R-sig-Geo] loop memory problem

marta azores martazores at gmail.com
Tue Mar 14 18:02:23 CET 2017


> #answer Marcel in r-sig-geo forum:
>> #1) maybe you can first make your calculations and afterwards plot the
>> result.
>> #It might be faster this way.
>>
>> #2) maybe you can use the parallel foreach loop in order to use the
>> whole performance of your cpu
>>
>> #3) loops in R are really slow in general.. you could also think about
>> some fancy stuff like some R compiler or Rcpp package
>>
>> #4) in order to analyze your memory consumption, you should have a look
>> at your system resources (depending on your OS: task manager (win) or htop
>> (linux))
>>
>> #############
>> #1)MARCEL: maybe you can first make your calculations and afterwards plot
>> the result.
>> #It might be faster this way.
>> ##MARTA: I did that, but the difference between plotting inside
>> the loop or afterwards is less than a second.
>> #
>>
>       #############

> #2) MARCEL: maybe you can use the parallel foreach loop in order to use the
>> whole performance of your cpu
>> #MARTA: I developed 5 different loops. The old one (2.A) works
>> properly but is too slow. I followed your suggestion and tried parallel
>> loops to increase the speed. I wrote a loop (2.B) with %do%, which works,
>> but takes the same time as the old loop. Most of the loops with %dopar%
>> did not work. Only the simple one (2.D) runs, but it is not what I want. The
>> other two (2.C and 2.E) fail with errors: 2.C with 'task 1
>> failed - "non-numeric argument to binary operator"' and 2.E with 'task 1
>> failed - "no method for coercing this S4 class to a vector"'.
>> #
>> #Any idea how to repair these loops?
>>
>> #########
>> #library
>> ########
>> library(parallel);library(foreach);library(doParallel);
>> library(gdistance);library(raster)
>> library(rgdal);library(rgeos)
>> library(sp)
>> #######
>> #data
>> ###########
>> path_data<-"E:/Q11/"
>> # cost raster sea and island
>> costa6Azo <- raster("E:/Q11/costa6Azo_projected.tif")
>> transitioncosta6Azo <- transition(costa6Azo, min, directions=16) # why min????
>> trCostS16 <- gdistance::geoCorrection(transitioncosta6Azo, type="c")
>> #points
>> boat <- read.table(paste0(path_data,"boat2905.csv"), header=TRUE,
>> sep=";", na.strings="NA", dec=".", strip.white=TRUE)
>> pos<-as.data.frame(cbind(boat$Lat1,boat$Long1,boat$Ref))
>> str(pos)
>> sp::coordinates(pos) <- ~V2+V1
>> sp::proj4string(pos) <-CRS("+proj=longlat +datum=WGS84 +no_defs")
>>
>> pos<-sp::spTransform(pos, CRS("+proj=utm +zone=26 +ellps=intl +towgs84=-104,167,-38,0,0,0,0 +units=m +no_defs"))
>> # The aim of the loop: calculating the track of the whole sailing path
>> #############################################################
>> # first segment (points 1 to 2)
>> line<- gdistance::shortestPath(trCostS16, pos@coords[1,],pos@coords[2,],
>> output="SpatialLines")
>> # gLength: 1-2 = 4887.737 m; 2-3 = 12590.01 m; 12-11 = 11360.39 m; 12-13 = 9453.001 m
>> lines(line,col=5)
>> #2.A) old loop ##################################################
>> #it works, but it's too slow
>>
>

> ## here we start with the for-loop
>> for (i in (seq(2,length(pos) - 1))) {
>>   # calculation of the rest of the segments
>>   nextSegment<- gdistance::shortestPath(trCostS16, pos@coords[i,],
>>     pos@coords[i+1,], output="SpatialLines")
>>   # simple addition combines the single SpatialLines segments
>>   line <- nextSegment + line
>>   # we plot each new segment
>>   lines(nextSegment)
>> }
>> # note that we have now ten combined line features in this SpatialLines object
>> line
>> gLength(line)#110747.2
>>
>> #2.B### new parallel loop (%do%) ##################################
>> #it works, but it's too slow
>> #
>> line<- gdistance::shortestPath(trCostS16, pos@coords[1,],pos@coords[2,],
>> output="SpatialLines")
>> # gLength: 1-2 = 4887.737 m; 2-3 = 12590.01 m; 12-11 = 11360.39 m; 12-13 = 9453.001 m
>>
>> x <- foreach(i=2:13) %do%
>>   {nextSegment<-gdistance::shortestPath(trCostS16, pos@coords[i,],
>>     pos@coords[i-1,], output="SpatialLines")
>>     line <- nextSegment + line
>>  }
>> x      ##  it works!!!! 1+ 12 spatialLines!!
>> ##
>> ##
>
>
#     #2.C#parallel loop ##%dopar%

> #without success
>>
> registerDoParallel()
>
> getDoParWorkers()
>> line<- gdistance::shortestPath(trCostS16, pos@coords[1,],pos@coords[2,],
>> output="SpatialLines")
>> # gLength: 1-2 = 4887.737 m; 2-3 = 12590.01 m; 12-11 = 11360.39 m; 12-13 = 9453.001 m
>> #function
>> #
>> funMTM<-function(){
>> nextSegment<-gdistance::shortestPath(trCostS16, pos@coords[i,],pos@coords[i-1,],
>> output="SpatialLines")
>> line <- nextSegment + line
>> }
>> getDoParWorkers()
>> ptime <- system.time({
>>   result <- foreach(i=2:13) %dopar% funMTM()
>>   })
>> ptime
>> #Error in funMTM() :
>> #task 1 failed - "non-numeric argument to binary operator"
>> #
>>
> #

> #2.D#loop %dopar% simple
>>
>        # it works, but I need the spatiallines output, not a list.

> #If I run %dopar% with only the shortestPath function, without
>> adding the lines to the SpatialLines object, I get a list with 13 features
>> (individual SpatialLines). However, I need one SpatialLines object with 13
>> lines inside.
>> registerDoParallel()
>> registerDoSEQ()
>> registerDoParallel(cores=10)
>> getDoParWorkers()
>> system.time(foreach(i=2:13) %dopar% gdistance::shortestPath(trCostS16,
>> pos@coords[i,],pos@coords[i-1,], output="SpatialLines"))
>> #user  system elapsed
>> #0.74    0.69   54.81
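
One way the list returned in 2.D might be turned into a single SpatialLines object is sketched below. It assumes only the sp package; `make_segment()` and the three toy segments are hypothetical stand-ins for the real shortestPath() results, which (like these) all carry the same line ID. `rbind()` for SpatialLines rejects duplicate IDs unless `makeUniqueIDs = TRUE` is passed.

```r
library(sp)

# Hypothetical stand-in for the %dopar% result: a list of three
# single-line SpatialLines objects that all carry the same ID "1".
make_segment <- function(from, to) {
  SpatialLines(list(Lines(list(Line(rbind(from, to))), ID = "1")))
}
segments <- list(
  make_segment(c(0, 0), c(1, 1)),
  make_segment(c(1, 1), c(2, 0)),
  make_segment(c(2, 0), c(3, 2))
)

# rbind() for SpatialLines stops on duplicate IDs unless asked to rename them.
track <- do.call(rbind, c(segments, makeUniqueIDs = TRUE))

print(class(track))         # class of the combined object
print(length(track@lines))  # number of Lines features it holds
```

With the real data, `segments` would be the list produced by the 2.D loop, and `rgeos::gLength(track)` could then be applied to the combined object as in 2.A.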
>>
>
#    #2.E#loop %dopar% with the function defined in the loop
      #without success:

> system.time(foreach(i=2:13) %dopar% {
>>   nextSegment=gdistance::shortestPath(trCostS16, pos@coords[i,],
>>     pos@coords[i-1,], output="SpatialLines")
>>   line =nextSegment + line})
>> #Error in { :
>> #task 1 failed - "no method for coercing this S4 class to a vector"
>> stopImplicitCluster()  # no cluster object 'cl' was created above
>>
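
A possible reading of the two %dopar% errors, offered as an assumption rather than a diagnosis: each worker is a separate R session, so packages attached on the master are not attached there unless requested through `.packages`, and an assignment to `line` inside a worker never reaches the master. The sketch below therefore lets each worker return only its own segment and leaves the combining to `.combine` on the master. The raster, points, and cluster size are hypothetical toy values, not the Azores data.

```r
library(gdistance)   # also attaches raster and sp
library(doParallel)  # also attaches foreach and parallel

# Hypothetical toy data: a uniform cost surface and five fixed points.
r <- raster(nrows = 50, ncols = 50, xmn = 0, xmx = 50, ymn = 0, ymx = 50,
            crs = "+proj=utm +zone=26 +datum=WGS84 +units=m")
values(r) <- 1
tr <- geoCorrection(transition(r, mean, directions = 8), type = "c")
pts <- cbind(x = c(5, 20, 35, 45, 10), y = c(5, 30, 10, 40, 45))

cl <- makeCluster(2)   # explicit cluster, so stopCluster(cl) is valid later
registerDoParallel(cl)

# Combining runs on the master; duplicate line IDs are renamed on the fly.
bind_lines <- function(a, b) rbind(a, b, makeUniqueIDs = TRUE)

track <- foreach(i = seq_len(nrow(pts) - 1),
                 .combine = bind_lines,
                 .packages = "gdistance") %dopar% {
  # Each worker computes and returns one segment; nothing is accumulated here.
  shortestPath(tr, pts[i, ], pts[i + 1, ], output = "SpatialLines")
}

stopCluster(cl)
print(length(track@lines))  # one Lines feature per consecutive pair of points
```

Whether this is faster than the sequential loop depends on how the cost of shipping the transition object to each worker compares with the cost of one shortestPath() call, so the timing would need to be checked on the real data.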
>> ##############
>> #3)MARCEL: loops in R are really slow in general.. you could also think
>> about some fancy stuff like some R compiler or Rcpp package
>> #MARTA##### Functions #####
>> # byte code compilation
>> library(compiler)
>> # cmpfun() expects a function, not the result of a call
>> myfunc <- function(i) gdistance::shortestPath(trCostS16, pos@coords[i,],
>>   pos@coords[i+1,], output="SpatialLines")
>> myFuncCmp <- cmpfun(myfunc)
>> system.time({
>>   output <- lapply(seq_len(length(pos) - 1), myFuncCmp)  # list of SpatialLines
>> })
>> #############
>>
>>       #############

> #4) MARCEL: in order to analyze your memory consumption, you should have a
>> look at your system resources (depending on your OS: task manager (win) or
>> htop (linux))
>> #MARTA: I ran the loop with the Task Manager window open, and memory usage
>> never exceeded 30%.
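
Task Manager reports usage for the whole process. As a complementary check from inside R, base functions such as `object.size()` and `gc()` can be used. The sketch below is a toy illustration only: `segments` is a hypothetical stand-in (a list of numeric vectors) for the objects built in the loop.

```r
# Toy stand-in for the accumulated segments: a list of numeric vectors.
segments <- lapply(1:3, function(i) numeric(1e5))

# Size of each element and of the whole list, as reported by R itself.
sizes_bytes <- vapply(segments, function(s) as.numeric(object.size(s)), numeric(1))
print(sizes_bytes)
print(format(object.size(segments), units = "MB"))

# gc() triggers a garbage collection and returns a usage table;
# its second column is the memory currently in use, in Mb.
mem <- gc()
print(mem[, 2])
```

Calling `gc()` before and after the loop gives a rough idea of how much memory the loop itself retains.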
>> ####files
>> https://drive.google.com/open?id=0BwqSBe1Yq-FBUWVBOUdvaThEU1k
>>
>>
> I don't have a solution yet, but I'm closer now; your suggestions were really
helpful.
Marta
