[Rd] data frame subset patch

Vladimir Dergachev vdergachev at rcgardis.com
Tue Nov 28 21:54:54 CET 2006


Hi all, 

   Here is a patch that significantly speeds up the `[.data.frame` operator.
It applies cleanly to both 2.4.0 and svn trunk. `make check` passed for 2.4.0
(for svn trunk it fails even without this patch).

   What it does: it removes the class and attr statements that modify the
incoming data frame and uses explicit calls to .subset and .subset2 instead.
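
To illustrate the idea (this is only a minimal sketch, not the patch itself), .subset and .subset2 behave like `[` and `[[` but skip method dispatch, so they work on the underlying list of columns. The toy data frame `df` and the variable names here are made up for the example.

```r
# Hypothetical toy data frame, for illustration only.
df <- data.frame(a = 1:5, b = c(0.5, 1.5, 2.5, 3.5, 4.5))

# Usual route: dispatches to `[.data.frame`.
v1 <- df[3, 2]

# Direct route: .subset2 acts like `[[` without dispatch and
# returns column 2 as a plain vector, which is then indexed by row.
v2 <- .subset2(df, 2)[3]

# .subset acts like `[` without dispatch: the result is a plain
# named list holding the selected column, not a data frame.
cols <- .subset(df, 2)

stopifnot(identical(v1, v2))
```

Both routes return the same element; the second one simply avoids the overhead of the data frame method.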

Test case:

N <- 100000
T <- data.frame(a=1:N, b=rnorm(N), c=as.character(round(runif(N)*10)))
system.time({X <- 0; for(i in 1:1000) X <- X + T[i,2]})

Without the patch, the output on my system is:
[1]  8.488  2.436 10.926  0.000  0.000


With this patch the output is:
[1] 1.084 0.624 1.707 0.000 0.000

                    thank you !

                             Vladimir Dergachev

