[Rd] Most efficient way to check the length of a variable mentioned in a formula.
Gabriel Becker
gmbecker at ucdavis.edu
Fri Oct 17 20:23:04 CEST 2014
Joris,
For me,
length(environment(form)[["x"]])
was about twice as fast as
length(get("x", environment(form)))
in the year-old version of R (3.0.2) that I have on the virtual machine I'm
currently using.
As in your benchmark, the eval method was much slower for me, though my
factor was much larger than 20 (well over 100 in the timings below):
> system.time({thing <- replicate(10000, length(environment(form)[["x"]]))})
   user  system elapsed
  0.018   0.000   0.018
> system.time({thing <- replicate(10000, length(get("x", environment(form))))})
   user  system elapsed
  0.031   0.000   0.033
> system.time({thing <- replicate(10000, eval(parse(text = "length(x)"),
+     envir = environment(form)))})
   user  system elapsed
  4.528   0.003   4.656
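
For anyone who wants to rerun the timings above, here is a minimal sketch of a setup. The definitions of `x` and `form` are assumptions on my part, since they are not shown in the transcript. The sketch only checks that the three lookups agree on the length. Wrap each one in system.time(replicate(...)) as above to time it.

```r
# Assumed setup (not shown in the transcript): a vector x and a formula
# created in the same environment, so environment(form) contains x.
x <- rnorm(100)
form <- y ~ x

env <- environment(form)

# The three lookups compared in the timings
n_subset <- length(env[["x"]])                              # direct [[ on the environment
n_get    <- length(get("x", envir = env))                   # get()
n_eval   <- eval(parse(text = "length(x)"), envir = env)    # parse + eval

# All three should report the same length
stopifnot(n_subset == n_get, n_get == n_eval)
```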
I can't say right now whether this pattern will hold in the more
recent versions of R I typically use.
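
One difference between the [[ and get() approaches matters regardless of speed: [[ looks only in the formula's environment itself, while get() by default (inherits = TRUE) also searches the enclosing environments. A small sketch of that, where `make_formula` is a hypothetical helper used only for illustration:

```r
# x lives in the calling environment, not inside the function
x <- 1:10

# Hypothetical helper: the formula is created inside a function, so its
# environment is that function's frame, which does not itself contain x.
make_formula <- function() {
  y ~ x
}
form <- make_formula()
env <- environment(form)

# [[ checks only env itself and returns NULL when the name is absent
direct <- env[["x"]]

# get() walks up the enclosing environments and finds x one level up
inherited <- get("x", envir = env)

length(direct)     # 0, because direct is NULL
length(inherited)  # 10
```

So the faster [[ form can silently report a length of 0 when the variable sits in an enclosing environment, whereas get() with inherits = FALSE would raise an error in the same situation.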
~G
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

On Fri, Oct 17, 2014 at 11:04 AM, Joris Meys <jorismeys at gmail.com> wrote:
> Dear R gurus,
>
> I need to know the length of a variable (let's call that X) that is
> mentioned in a formula. So obviously I look for the environment from which
> the formula is called and then I have two options:
>
> - using eval(parse(text='length(X)'), envir=environment(formula))
>
> - using length(get('X', envir=environment(formula)))
>
> A bit of benchmarking showed that the first option is about 20 times
> slower, so that if I repeat it 10,000 times the second option saves more
> than half a second. So speed is not really an issue here.
>
> Personally I'd go for option 2 as that one is easier to read and does the
> job nicely, but with these functions I'm always a bit afraid that I'm
> overlooking important details or side effects here (possibly memory issues
> when working with larger data).
>
> Does anybody have an idea what the dangers of these methods are, and which
> one is the most robust?
>
> Thank you
> Joris
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel : +32 9 264 59 87
> Joris.Meys at Ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Gabriel Becker
Graduate Student
Statistics Department
University of California, Davis