[R] RE : RE : R getting slower until it breaks...

jim holtman jholtman at gmail.com
Thu Oct 7 22:03:39 CEST 2010


perfmon under Windows will give you the amount of I/O that a process
is doing.  You can see whether the number of I/O Reads and I/O Writes is
increasing for your process.  I don't know what the internals of the
functions you are calling are doing w.r.t. how the files are accessed.
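
As a rough complement to perfmon, CPU time can be compared with wall-clock time from inside R: if elapsed time keeps growing while CPU time does not, the process is probably waiting on something such as I/O or paging.  The following is only a minimal sketch; time_batch() and fake_batch() are hypothetical names, and Sys.sleep() stands in for a real batch that is blocked waiting.

```r
# Hypothetical helper: run one batch and report CPU time vs. wall-clock time.
time_batch <- function(batch_fun, ...) {
  t <- system.time(batch_fun(...))
  cpu <- unname(t["user.self"] + t["sys.self"])   # CPU seconds used by this R process
  elapsed <- unname(t["elapsed"])                 # wall-clock seconds
  data.frame(cpu = cpu, elapsed = elapsed,
             cpu_share = if (elapsed > 0) cpu / elapsed else NA_real_)
}

# Stand-in for a real batch (assumption): sleeping uses wall-clock time
# but almost no CPU, much like a process blocked on I/O.
fake_batch <- function(wait) Sys.sleep(wait)

# One row per batch; a falling cpu_share over successive batches suggests waiting.
timings <- do.call(rbind, lapply(c(0.1, 0.2), function(w) time_batch(fake_batch, w)))
print(timings)
```

In a real run, batch_fun would be the call that writes one batch of images, and the cpu_share column could be compared with the I/O counters that perfmon reports for the same period.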

On Thu, Oct 7, 2010 at 1:18 PM, Bastien Ferland-Raymond
<bastien.ferland-raymond.1 at ulaval.ca> wrote:
> Actually, I rechecked the paging and thought about it more, and realized that Windows was set to French and, therefore, the "," was the decimal separator and not the thousands separator.  So the paging was staying around 6 (not 6000) most of the time and hitting 100 at other times.  It looks a little more normal.
>
> About the I/O: I'm really not familiar with this concept.  I really want to answer your questions and test all of that, but I don't know how.  Can you check the I/O in R, or are you supposed to do it in Windows?  What function/program should I use?
>
>
> Thanks
> Bastien
>
>
> ________________________________________
> From: jim holtman [jholtman at gmail.com]
> Sent: October 6, 2010 20:09
> To: Bastien Ferland-Raymond
> Cc: r-help at R-project.org
> Subject: Re: RE : [R] R getting slower until it breaks...
>
> 6000 pages/sec sounds very high, and if your CPU utilization is
> decreasing over time, paging is one possible cause.  The system is paging
> memory out and having to wait for I/O to complete and therefore is not
> using the CPU.  What other I/O is your system doing?  When you are
> partitioning the image, are you doing a lot of I/O to write it out?
> So I would assume that you are somehow blocking on I/O and that is why
> your utilization (and throughput) is decreasing.  You might want to
> look at what your application is doing in this area.  Reading in data
> can also contribute to paging.
>
> When you are breaking apart the data, are you starting the read from
> the front of the file each time, or picking up where you left off?  So
> you might want to take a closer look around how you are doing your
> I/O.
>
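
The difference between restarting from the front of the file and picking up where you left off can be illustrated in base R with an open connection.  This is only a sketch of the general pattern on a small temporary text file; the code in this thread reads through the raster package, whose internal file access is not known here.

```r
# Write a small stand-in file (hypothetical data, 10 lines).
path <- tempfile(fileext = ".txt")
writeLines(sprintf("row %d", 1:10), path)

# Open the connection once.  Each readLines() call then continues from the
# current position instead of rescanning the file from the start.
con <- file(path, open = "r")
chunks <- list()
repeat {
  chunk <- readLines(con, n = 4)      # next 4 lines, or fewer at end of file
  if (length(chunk) == 0) break       # nothing left to read
  chunks[[length(chunks) + 1]] <- chunk
}
close(con)
unlink(path)

lengths(chunks)   # chunk sizes: 4 4 2
```

Passing the file name to readLines() each time would instead reopen the file and begin at the first line on every call.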
> On Wed, Oct 6, 2010 at 3:06 PM, Bastien Ferland-Raymond
> <bastien.ferland-raymond.1 at ulaval.ca> wrote:
>> Thanks a lot for your quick answer.  Here is my answer to your questions:
>>
>> Have you looked to see how fast your memory might be growing?
>> BFR- Yes I did, it's not too bad: it starts around 60 000 KB and rises to 120 000 at the most, so not too scary.
>>
>> Are you leaving around any large objects that should be removed?
>> BFR- I was careful to make sure the function doesn't create anything that would be visible with objects().  Could it be creating other types of (hidden) objects?  Maybe, but I'm not very familiar with that stuff.
>>
>> Have you looked to see if you are paging?
>> BFR- I just read the wiki about paging; I didn't know that term before.  If I look at perfmon, it looks like it is holding steady at 6000 pages/s with rare peaks as high as 900 000.  Does that sound normal?  How can it affect R?
>>
>> Is it your CPU time that is increasing, or your wall clock time?
>> BFR- If I go to the Task Manager - Performance, R is initially using around 40% of the processor (so around 80% of one core), but as (real) time passes, it gets lower and lower, down to as low as 6% (12% of one core).  I was surprised to see that, as my simulations in R usually use one whole core.
>>
>> It sounds like there might be some memory leak that might be causing your process size to grow and possibly causing paging.  You will need to gather some of the performance data that perfmon can provide and look at the memory usage, CPU time and I/O rates over time to see if there are any changes.
>> BFR- The term "memory leak" feels right for my problem.  Are there ways I can control/detect/prevent this kind of problem in R?  Also, how can I check the I/O?  I never looked at that before.
>>
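
One hedged way to watch for memory growth from inside R is to record what gc() reports before and after a block of work and to list the largest objects still alive.  This is a minimal sketch: used_mb() is a hypothetical helper, and the retained list only mimics a leak.  It sees R-managed memory only, so memory held by compiled libraries outside R's heap would not show up here.

```r
# Hypothetical helper: memory used by R objects, in MB, taken from the
# second column of gc() (the Ncells and Vcells rows summed).
used_mb <- function() sum(gc()[, 2])

before <- used_mb()
keep <- vector("list", 5)
for (i in seq_along(keep)) {
  keep[[i]] <- numeric(1e6)   # about 8 MB each, deliberately retained to mimic a leak
}
after <- used_mb()
growth <- after - before      # MB gained across the loop

# Largest objects in the current environment, biggest first (sizes in bytes).
env <- environment()
sizes <- sapply(ls(env), function(nm) object.size(get(nm, envir = env)))
head(sort(sizes, decreasing = TRUE), 3)
```

Logging growth after each batch would show whether R's own heap keeps climbing, which helps separate an R-level leak from paging or I/O problems elsewhere.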
>> Thanks again
>>
>> Bastien
>>
>>
>>
>> On Wed, Oct 6, 2010 at 2:11 PM, Bastien Ferland-Raymond
>> <bastien.ferland-raymond.1 at ulaval.ca> wrote:
>>> Hello R-users,
>>>
>>> I'm currently facing a pretty hard problem which I'm hoping you'll be able to help me with.  I'm using R to create images.  That alone is not the problem; the problem is that I'm using R to create 168 000 images...  My code (which is given below) uses different packages (raster and rgdal) to import an image (size 20 GB) and divide it into 168 000 pictures that are 100 pixels x 100 pixels.  The code works fine for making the images, but if I ask it to run all 168 000, it always breaks around 15 000.
>>>
>>> It starts with the code being able to make around 2 pictures per second, but then it slows down, and after around 2000 pictures it's only 1 picture per second.  Later on it gets closer to 1 picture every 3 seconds, etc., until it crashes.  I get no error message from R, only Windows telling me that "R encountered a problem and must be closed..."  Initially I thought it was a Windows problem: that I couldn't put too many files into a folder and that this was slowing it down.  Then I divided my batch process into smaller (5000 files) folders, but it didn't help; it still breaks at 15 000.  I also tried calling gc() after every 5000 pictures to save memory, but it didn't help either.  I removed every loop from the code because I thought they were the problem, but it just crashed faster...  After the crash, I need to restart the computer if I want to get back to the initial speed.
>>>
>>> I'm pretty much running out of options.  Is there a limit in R on the number of files it can create in one session?  Is it a Windows problem?  Is there a better way to clear the memory than gc()?  Any thoughts on that?
>>>
>>> I'm using R 2.11.1 on Win XP; my hard drive is NTFS; computer: Intel Core 2 Duo E6750, 32-bit, with 2 GB of RAM.
>>>
>>> Here is my code, but I doubt it would help much with my problem:
>>>
>>> ########
>>> # It is made of 4 functions (sorry, it's in French):
>>>
>>> ##########################################################################
>>> ##########################################################################
>>> ###  Set of functions for making the red and green NDVI images         ###
>>> ##########################################################################
>>> ######  Bastien Ferland-Raymond, 5 oct 2010  #############################
>>> ##########################################################################
>>>
>>> ########
>>> ## Simply run the whole script
>>> ########
>>> ### Required libraries:
>>> library(raster)
>>> library(rgdal)
>>> library(shapefiles)
>>>
>>> #############################################################################
>>> ## Function 1  -  NDVI from pixel coordinates and width #####
>>>  calculate_NDVI<- function(Type, object, VALUE) {
>>>   redorgreen <- ifelse(Type=="red",2,3)
>>>   list1 <- unstack(object)
>>>   rast1 <- list1[[1]]
>>>   rast2 <- list1[[redorgreen]]
>>>   NAvalue(rast1)<- -99999
>>>   NAvalue(rast2)<- -99999
>>>   cells1 <- getValuesBlock(rast1,row=VALUE[[2]],nrow=VALUE[[3]],col=VALUE[[1]],ncol=VALUE[[3]])
>>>   cells2 <- getValuesBlock(rast2,row=VALUE[[2]],nrow=VALUE[[3]],col=VALUE[[1]],ncol=VALUE[[3]])
>>>   cells1[is.na(cells1)]<-0;
>>>   cells2[is.na(cells2)]<-0;
>>>   calculNDVI <-(cells1 - cells2) / (cells1 + cells2)
>>>   NDVImatrix <- matrix(calculNDVI,nrow=VALUE[[3]],ncol=VALUE[[3]], byrow=TRUE)
>>>   NDVImatrix <- NDVImatrix + 1
>>>   NDVImatrix <- NDVImatrix * (255/2)
>>>   return(NDVImatrix)
>>>   }
>>> #################################################################################
>>> ## Function 1b  -  Make the TIFF
>>>  make.tiff<- function(NV=newValues,TT=Type,img=imgRaster,nom){
>>>  pixelNDVIMatrix <- calculate_NDVI(TT,img,NV[c(1,2,3)])
>>>  newRaster <- raster(pixelNDVIMatrix)
>>>  NAvalue(newRaster)<-999999
>>>  nnom<-nom[NV[4]]
>>>  writeRaster(newRaster, filename=nnom,datatype="INT1U",format="GTiff",overwrite=FALSE)
>>>  aaa<-2
>>> }
>>>
>>> #################################################################################
>>> ## Function 2  -  Function converting metric coordinates to pixel coordinates #####
>>>  latlong_to_pixels<- function(Coord, facteur, meterWidth=NULL) {   # Coord must be c(x,y)
>>>  newX <- Coord[1] / facteur
>>>  newY <- Coord[2] / facteur
>>>  if(!is.null(meterWidth)){
>>>   newWidth <- meterWidth / facteur
>>>   return(c(newX,newY,newWidth))
>>>  }
>>>  return(c(newX,newY))
>>>  }
>>>
>>> #############################################################################
>>> ####  Function 3  -  Main function   #####
>>>  make.NDVI.photo<- function(tableDesPlacettes,Type,newImagesDirectory,textAndImgDirectory="U:\\kNN_Valcartier\\Photo aerienne"){
>>>  lastWD<- getwd()
>>> setwd(textAndImgDirectory)
>>>  imgRaster<- stack(imageAssociee)
>>>  x1<- tableDesPlacettes[,2] - xmin(imgRaster) - (tailleFenetres/2)     # The image origin for calculation is in the top left corner
>>>  y1<- ymax(imgRaster) - tableDesPlacettes[,3] - (tailleFenetres/2)
>>> coo <- cbind(x1,y1)
>>>  newValues<- t(apply(coo,1,latlong_to_pixels,facteurMetreParPixel,tailleFenetres))
>>>  newImgName<- paste(newImagesDirectory,substr(Type,1,3),"_","GC",tableDesPlacettes[,1],".tif",sep="")
>>> apply(cbind(newValues,1:length(newImgName)),1,make.tiff,Type,imgRaster,nom=newImgName)
>>> setwd(lastWD)
>>> }
>>>
>>>
>>> ###########################
>>> ## Executing functions:
>>> #############################
>>>
>>>
>>> ###  load the raw window data
>>> fichier.fenetre.brute<-read.dbf("U:\\kNN_Valcartier\\Fenêtres 30x30 1 octobre\\168700_30m_centroid.dbf", header=T)
>>> ###  Select the complete windows
>>> fenetre.complete<-round(fichier.fenetre.brute[[1]][,2],1)==900
>>> ###  Pull out the centroids for extraction
>>> centro.tout.900<-fichier.fenetre.brute[[1]][fenetre.complete,c(1,5,6)]
>>> #rm(fichier.fenetre.brute) ; gc()
>>>
>>> ## data required by the function
>>>  imageAssociee<- "mosaique_all_v1.img"  # image name
>>>  facteurMetreParPixel<- 0.3         # metres per pixel
>>>  tailleFenetres= 30            # in metres
>>>
>>>  start.time<-Sys.time();start.time
>>> make.NDVI.photo(centro.tout.900[19137:24136,],"red","BFR\\NDVI_red_fenetre\\batch 3\\")
>>>  stop.time<-Sys.time()
>>>  time.run<-stop.time-start.time
>>>  alarm()
>>>  time.run
>>> gc()
>>>  start.time<-Sys.time();start.time
>>> make.NDVI.photo(centro.tout.900[24137:29136,],"red","BFR\\NDVI_red_fenetre\\batch 4\\")
>>>  stop.time<-Sys.time()
>>>  time.run<-stop.time-start.time
>>>  alarm()
>>>  time.run
>>> gc()
>>>
>>> #############
>>>
>>>
>>> Voilà,
>>>
>>> Thanks!
>>>
>>> Bastien Ferland-Raymond
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?