[R] speed issues? read R_inferno by Patrick Burns: & a memory query

maddox matthewgdodds at hotmail.com
Thu Dec 23 14:13:24 CET 2010


Hi,

I'm just starting out with R and came across R_inferno.pdf by Patrick Burns
just yesterday - I recommend it!

His description of how 'growing' objects (e.g. obj <- c(obj,
additionalValue)) eats up memory prompted me to rewrite a function (which
made such calls ~210 times) so that it indexes into a pre-dimensioned
object instead (i.e. obj[i, ] <- additionalValue).
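
For anyone wanting to see the two patterns side by side, here is a minimal sketch. It is not my actual function: make_row() is a hypothetical stand-in for the real per-iteration work, and the 3-column shape is an assumption chosen only for illustration.

```r
# Hypothetical stand-in for the real per-iteration result: a row of 3 numbers.
make_row <- function(i) c(i, i^2, i^3)

n <- 210  # roughly the number of calls mentioned above

# Growing: every rbind() allocates a new, larger object and copies the old one.
grow <- function(n) {
  obj <- NULL
  for (i in seq_len(n)) obj <- rbind(obj, make_row(i))
  obj
}

# Preallocating: the matrix is created once and then filled in by index.
prealloc <- function(n) {
  obj <- matrix(NA_real_, nrow = n, ncol = 3)
  for (i in seq_len(n)) obj[i, ] <- make_row(i)
  obj
}

grown  <- grow(n)
filled <- prealloc(n)

# Both approaches should produce the same values.
stopifnot(all(grown == filled))
```

With rows this small the timing difference will be negligible; wrapping each call in system.time() with a larger n and wider rows is one way to see the gap.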

This changed the timings as follows.
old version:
   user  system elapsed 
133.436  14.257 155.807 

new version:
   user  system elapsed 
 16.041   1.180  18.535 

To say I'm delighted is an understatement. Thanks for putting the Inferno
together, Patrick.

However, I'm misunderstanding the effect this has on memory use (or
misunderstanding the code I've hijacked to look at memory use). To look at
virtual memory use I'm using the code below, taken from this forum:
cmd <- paste("ps -o vsz", Sys.getpid()) 
cat("\nVirtual size: ", system(cmd, intern = TRUE) [2], "\n", sep = "") 
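
The before/after measurements can be wrapped up roughly as follows. This is a sketch, not the exact code used for the runs in this post: vsz() and measure() are hypothetical helper names, it passes the process id to ps with -p, it only makes sense on Unix-alikes, and it returns NA where ps is not available.

```r
# Virtual size of the current R process as reported by ps (Unix-alikes only).
vsz <- function() {
  if (!nzchar(Sys.which("ps"))) return(NA_real_)  # ps not found on this system
  out <- system(paste("ps -o vsz -p", Sys.getpid()), intern = TRUE)
  # out[1] is the "VSZ" header line, out[2] holds the value.
  suppressWarnings(as.numeric(trimws(out[2])))
}

# Hypothetical helper: virtual size before and after evaluating `expr`.
measure <- function(expr) {
  gc()                          # collect garbage first, as in the runs below
  before <- vsz()
  timing <- system.time(expr)   # evaluates the expression and times it
  after  <- vsz()
  list(before = before, after = after, diff = after - before, timing = timing)
}

# Usage with a trivial placeholder workload.
res <- measure(for (i in 1:1000) sqrt(i))
cat("Virtual size before call:", res$before, "\n")
cat("Virtual size after call: ", res$after, "\n")
cat("Difference:              ", res$diff, "\n")
```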

I did three runs of the old version and three of the new, preceding each
with gc(), and got the outputs below. In summary, the runs of the old method
required 17712, 17780 & 17744, and the runs of the new method required
13788, 15140 & 13656.
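
To put numbers on "modest", the six differences from the run outputs in this post can be summarised as below. The values are copied from those outputs and are in whatever units ps reports (typically kilobytes); the variable names are only for illustration. The average increase works out to about 20% lower for the new version.

```r
# Increases in virtual size, copied from the run outputs in this post.
old_runs <- c(17712, 17780, 17744)
new_runs <- c(13788, 15140, 13656)

mean_old <- mean(old_runs)
mean_new <- mean(new_runs)

# Relative reduction in the average increase, as a fraction of the old value.
reduction <- 1 - mean_new / mean_old

# Spread between runs of the same version (max minus min).
spread_old <- diff(range(old_runs))
spread_new <- diff(range(new_runs))

cat("mean old:", mean_old, " mean new:", mean_new, "\n")
cat("relative reduction:", round(100 * reduction, 1), "%\n")
cat("spread old:", spread_old, " spread new:", spread_new, "\n")
```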

Two questions:
1. Why do runs of the same process not make the same demand on memory?
They're doing exactly the same work & creating exactly the same new objects.
2. Is the modest decrease in memory consumed by the new method expected?
(Having read R_Inferno I was, perhaps naively, expecting more of an
improvement.)

Or am I missing something (more than likely!)?

Thanks

M




> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786300 21.0    1265230 33.8  1166886 31.2
Vcells 948412  7.3    3244126 24.8  3766604 28.8

> cat("old version")

Virtual size before call: 881692 
   user  system elapsed 
131.872  14.417 159.653 

Virtual size after call: 899404  
> 899404-881692
[1] 17712

##################

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786294 21.0    1265230 33.8  1166886 31.2
Vcells 948407  7.3    3244126 24.8  3766604 28.8

> cat("old version")

Virtual size before call: 881660 
   user  system elapsed 
133.281  14.473 159.661 
 
Virtual size after call: 899440  
> 899440-881660
[1] 17780

##################

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786294 21.0    1265230 33.8  1166886 31.2
Vcells 948407  7.3    3244126 24.8  3766604 28.8

> cat("old version")

Virtual size before call: 881696 
   user  system elapsed 
133.436  14.257 155.807 
 

Virtual size after call: 899440  
> 899440-881696
[1] 17744

################## ##################

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786413 21.0    1265230 33.8  1166886 31.2
Vcells 948460  7.3    3244126 24.8  3766604 28.8

> cat("new version")

Virtual size before call: 881696

   user  system elapsed 
 16.041   1.180  18.535 
 

Virtual size after call: 895484  
> 895484-881696
[1] 13788

##################

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786441 21.1    1265230 33.8  1166886 31.2
Vcells 948480  7.3    3244126 24.8  3766604 28.8

> cat("new version")

Virtual size before call: 882648 
   user  system elapsed 
 16.321   1.068  18.136 
 
Virtual size after call: 897788  
> 897788- 882648
[1] 15140

##################

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 786441 21.1    1265230 33.8  1166886 31.2
Vcells 948480  7.3    3244126 24.8  3766604 28.8

> cat("new version")

Virtual size before call: 882648

   user  system elapsed 
 16.581   0.992  19.351 

Virtual size after call: 896304
> 896304-882648
[1] 13656







