[Rd] Style question
Winston Chang
winstonchang1 at gmail.com
Fri May 30 23:40:05 CEST 2014
Using `::` does add some overhead - on the order of 5-10 microseconds
on my computer. Still, it would take 100,000 calls to add 0.5-1 second
of delay.
library(microbenchmark)

microbenchmark(
  base::identity(1),
  identity(1),
  unit = "us"
)
# Unit: microseconds
#               expr   min     lq median     uq    max neval
#  base::identity(1) 5.677 6.2180 6.6695 7.3655 60.104   100
#        identity(1) 0.262 0.2965 0.3210 0.4035  1.034   100
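
As a rough check on the 0.5-1 second estimate, the per-call cost can be taken from the difference between the two medians above and scaled to 100,000 calls. This is only a back-of-the-envelope sketch; the variable names are made up for illustration, and the numbers will differ on other machines.

```r
# Median timings (microseconds) copied from the benchmark output above.
median_qualified_us <- 6.6695   # base::identity(1)
median_plain_us     <- 0.3210   # identity(1)

# Extra cost of `::` per call, and the total over 100,000 calls.
overhead_us <- median_qualified_us - median_plain_us
n_calls     <- 1e5
total_sec   <- overhead_us * n_calls / 1e6

total_sec   # roughly 0.63 seconds
```
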
This test isn't exactly like putting identity in imports, since in
this case the number of environments to search is greater -- but it's
reasonably close.
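
To get a feel for how many environments R has to search in each case, here is a small sketch that walks the chain of enclosing environments and counts how many are inspected before a symbol is found. `lookup_depth` is a made-up helper, not part of R, and it only mimics the lookup order; it says nothing about timing.

```r
# Hypothetical helper: count the environments inspected, starting at `env`
# and following parent links, until `name` is found (NA if never found).
lookup_depth <- function(name, env = globalenv()) {
  depth <- 1L
  while (!exists(name, envir = env, inherits = FALSE)) {
    if (identical(env, emptyenv())) return(NA_integer_)
    env <- parent.env(env)
    depth <- depth + 1L
  }
  depth
}

# From the global environment (as in the benchmark above), the search runs
# through every attached package before reaching base.
lookup_depth("identity")
length(search())

# From inside a package namespace, the chain is namespace -> imports -> base
# namespace, so it is much shorter. The stats namespace is used here only as
# a convenient stand-in for "some package".
lookup_depth("identity", asNamespace("stats"))
```

An imported function would be found in the imports environment, one step above the calling package's namespace, so the unqualified call in the benchmark has a longer chain to walk than an imported function would.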
If you're in a situation where you want to be explicit about where a
function came from, but the slowness of `::` is an issue, you could
create a variable that points to the environment and access the
function using $:
base <- as.environment('package:base')

microbenchmark(
  base::identity(1),
  base$identity(1),
  identity(1),
  unit = "us"
)
# Unit: microseconds
#               expr   min     lq median     uq    max neval
#  base::identity(1) 5.520 6.0795 6.4485 7.0020 32.232   100
#   base$identity(1) 0.504 0.5940 0.6635 0.8105  7.701   100
#        identity(1) 0.248 0.2815 0.3100 0.3885  7.925   100
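
One caveat with the approach above: as.environment('package:base') only works for packages that are attached to the search path. For a package that is merely loaded (as is typical for something listed under Imports), a similar sketch can use asNamespace() instead, or simply bind the function to a local name once. The names `tools_ns` and `file_ext_fast` below are just for illustration, and note that the namespace environment also exposes non-exported objects.

```r
# Namespace environment of the tools package; this loads the namespace if
# needed but does not require the package to be attached.
tools_ns <- asNamespace("tools")
tools_ns$file_ext("text.txt")

# Alternative: resolve `::` once and reuse the result, so the lookup cost
# is paid a single time rather than on every call.
file_ext_fast <- tools::file_ext
file_ext_fast("text.txt")
```
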
-Winston
On Fri, May 30, 2014 at 2:53 PM, Hervé Pagès <hpages at fhcrc.org> wrote:
> Hi Gabe,
>
>
> On 05/30/2014 11:34 AM, Gabriel Becker wrote:
>>
>> This isn't likely to make much difference in most cases, but calling a
>> function via :: can incur up to about twice the overhead on average
>> compared to calling an imported function:
>>
>> > fun1
>> function ()
>> file_ext("text.txt")
>> <environment: namespace:imptest>
>> > fun2
>> function ()
>> tools::file_ext("text.txt")
>> <environment: namespace:imptest>
>> > microbenchmark(fun1(), times=10000)
>> Unit: microseconds
>> expr min lq median uq max neval
>> fun1() 24.506 25.654 26.324 27.8795 154.001 10000
>> > microbenchmark(fun2(), times=10000)
>> Unit: microseconds
>> expr min lq median uq max neval
>> fun2() 42.723 46.6945 48.8685 52.0595 2021.91 10000
>
>
> Interesting. Or with a void function so the timing more closely
> reflects the time it takes to look up the symbol:
>
> > void
> function ()
> NULL
> <environment: namespace:S4Vectors>
>
> > fun1
> function ()
> void()
> <environment: namespace:IRanges>
>
> > fun2
> function ()
> S4Vectors::void()
> <environment: namespace:IRanges>
>
> > microbenchmark(fun1(), times=10000)
> Unit: nanoseconds
>
> expr min lq median uq max neval
> fun1() 261 268 270 301 11960 10000
>
> > microbenchmark(fun2(), times=10000)
> Unit: microseconds
> expr min lq median uq max neval
> fun2() 13.486 14.918 15.782 16.753 60542.19 10000
>
> S4Vectors::void() is about 60x slower than void()!
>
> Cheers,
> H.
>
>>
>> Also, if one uses roxygen2 (or even if one doesn't), an ##' @importFrom
>> tag above the function doing the calling documents this.
>>
>> And of course, if you need to know where a function lives, environment()
>> will tell you.
>>
>> ~G
>>
>>
>> On Fri, May 30, 2014 at 10:00 AM, Hadley Wickham <h.wickham at gmail.com> wrote:
>>
>> > There is at least one subtle consequence to keep in mind when doing
>> > this. Of course, whatever choice you make, if the whatever() function
>> > moves to a different package, this breaks your package.
>> > However, if you explicitly import the function, your package will
>> > break at load-time (which is good) and you'll only have to modify
>> > 1 line in the NAMESPACE file to fix it. But if you do foo::whatever(),
>> > your package won't break at load-time, only at run-time. Also you'll
>> > have to edit all the calls to foo::whatever() to fix the package.
>> >
>> > Probably not a big deal, but in an environment like Bioconductor where
>> > infrastructure classes and functions can be shared by hundreds of
>> > packages, having people use foo::whatever() in a systematic way would
>> > probably make maintenance a little bit more painful than it needs to
>> > be when the need arises to reorganize/refactor parts of the
>> > infrastructure. Also, the ability to quickly grep the NAMESPACE
>> > files of all BioC packages to see who imports what is very convenient
>> > in this situation.
>>
>> OTOH, I think there's a big benefit to being able to read package code
>> and instantly know where a function comes from.
>>
>> Personally, I found this outweighs the benefits that you outline:
>>
>> * functions rarely move between packages, and gsubbing pkga::foo to
>> pkgb::foo isn't hard
>> * it's not that much harder to grep for pkg::foo in R/* than it is to
>> grep NAMESPACE
>>
>> Hadley
>>
>> --
>> http://had.co.nz/
>>
>> ______________________________________________
>> R-devel at r-project.org mailing list
>>
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>
>>
>> --
>> Gabriel Becker
>> Graduate Student
>> Statistics Department
>> University of California, Davis
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319