[Rd] proposal for adapting code of function gl()

Joris Meys jorismeys at gmail.com
Mon Apr 11 23:53:52 CEST 2011


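Based on a discussion on Stack Overflow (SO), I ran some tests and found
that converting to a factor is best done early, while the vector still has
only n elements, rather than after it has been repeated to full length.
Hence, I propose to rewrite the gl() function as: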

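gl2 <- function(n, k, length = n * k, labels = 1:n, ordered = FALSE) {
  # Build the factor once, on the n distinct values only.
  f <- factor(1:n, levels = 1:n, labels = labels, ordered = ordered)
  # Repeat each level k times, then recycle or truncate to 'length'.
  rep(rep(f, rep.int(k, n)), length.out = length)
}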
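
As a quick sanity check, the sketch below compares gl2() with base gl() on a few small inputs, including a truncated and a recycled length. The argument values in `cases` are chosen here purely for illustration and are not part of the tests reported in this message.

```r
# Proposed replacement, same behaviour as the definition above.
gl2 <- function(n, k, length = n * k, labels = 1:n, ordered = FALSE) {
  f <- factor(1:n, levels = 1:n, labels = labels, ordered = ordered)
  rep(rep(f, rep.int(k, n)), length.out = length)
}

# Illustrative argument sets (assumed values, for this example only).
cases <- list(
  list(n = 3, k = 2),                                   # default length n * k
  list(n = 3, k = 2, length = 5),                       # truncated
  list(n = 3, k = 2, length = 10),                      # recycled
  list(n = 2, k = 3, labels = c("lo", "hi"), ordered = TRUE)
)

# Each call must give exactly the same factor as base gl().
for (a in cases) {
  stopifnot(identical(do.call(gl, a), do.call(gl2, a)))
}

print(gl2(3, 2, length = 10))
```

The check uses identical() rather than all.equal() so that differences in class or levels would also be caught.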

Some test results:

> system.time(X1 <- gl(5,1e7))
   user  system elapsed
  29.21    0.30   29.58

> system.time(X2 <- gl2(5,1e7))
   user  system elapsed
   1.87    0.45    2.37

> all.equal(X1,X2)
[1] TRUE

> system.time(X1 <- gl(5,100,1e7))
   user  system elapsed
   5.98    0.05    6.05

> system.time(X2 <- gl2(5,100,1e7))
   user  system elapsed
   0.21    0.03    0.25

> all.equal(X1,X2)
[1] TRUE

> system.time(X1 <- gl(5,100,1e7,labels=letters[1:5]))
   user  system elapsed
   5.88    0.02    5.98

> system.time(X2 <- gl2(5,100,1e7,labels=letters[1:5]))
   user  system elapsed
   0.20    0.05    0.25

> all.equal(X1,X2)
[1] TRUE

> system.time(X1 <- gl(5,100,1e7,labels=letters[1:5],ordered=T))
   user  system elapsed
   5.82    0.03    5.89

> system.time(X2 <- gl2(5,100,1e7,labels=letters[1:5],ordered=T))
   user  system elapsed
   0.22    0.04    0.25

> all.equal(X1,X2)
[1] TRUE
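
To rerun these comparisons on another machine, something like the following could be used. It is only a sketch: the helper name `bench` is hypothetical, the vector length is reduced from 1e7 so that it finishes quickly, and the timings will depend on the machine and the R version.

```r
# Proposed replacement, same behaviour as the definition earlier in this message.
gl2 <- function(n, k, length = n * k, labels = 1:n, ordered = FALSE) {
  f <- factor(1:n, levels = 1:n, labels = labels, ordered = ordered)
  rep(rep(f, rep.int(k, n)), length.out = length)
}

# Hypothetical helper: time gl() and gl2() on the same arguments
# and report whether the two results agree.
bench <- function(...) {
  t1 <- system.time(x1 <- gl(...))[["elapsed"]]
  t2 <- system.time(x2 <- gl2(...))[["elapsed"]]
  data.frame(gl = t1, gl2 = t2, equal = isTRUE(all.equal(x1, x2)))
}

# Same argument patterns as the tests above, at a smaller length.
results <- rbind(
  bench(5, 1e5),
  bench(5, 100, 1e5),
  bench(5, 100, 1e5, labels = letters[1:5]),
  bench(5, 100, 1e5, labels = letters[1:5], ordered = TRUE)
)
print(results)
```

Raising the length back to 1e7 should reproduce the setup of the timings reported here.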

Reference to the Stack Overflow discussion:
http://stackoverflow.com/questions/5627264/how-can-i-efficiently-construct-a-very-long-factor-with-few-levels

-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
Joris.Meys at Ugent.be



More information about the R-devel mailing list