[Rd] [External] Re: JIT compiler does not compile closures with custom environments

luke-tierney using uiowa.edu
Wed Aug 18 16:45:56 CEST 2021


On Wed, 18 Aug 2021, Duncan Murdoch wrote:

> On 18/08/2021 9:00 a.m., Taras Zakharko wrote:
>> I have encountered a behavior of R’s JIT compiler that I can’t quite figure 
>> out. Consider the following code:
>> 
>>
>>     f_global <- function(x) {
>>       for(i in 1:10000) x <- x + 1
>>       x
>>     }
>>
>>     f_env <- local({
>>      function(x) {
>>        for(i in 1:10000) x <- x + 1
>>        x
>>      }
>>     })
>>
>>     compiler::enableJIT(3)
>>
>>    bench::mark(f_global(0), f_env(0))
>>    # 1 f_global(0)    103µs 107.61µs     8770.    11.4KB      0    4384     0
>>    # 2 f_env(0)       1.1ms   1.42ms      712.        0B     66.3   290    27
>>
>> Inspecting the closures shows that f_global has been byte-compiled while 
>> f_env has not been byte-compiled. Furthermore, if I assign a new 
>> environment to f_global (e.g. via environment(f_global) <- new.env()), it 
>> won’t be byte-compiled either.
>> 
>> However, if I have a function returning a closure, that closure does get 
>> byte-compiled:
>>
>>    f_closure <- (function() {
>>      function(x) {
>>        for(i in 1:10000) x <- x + 1
>>        x
>>      }
>>    })()
>>
>>    bench::mark(f_closure(0))
>>    # 1 f_closure(0)    105µs    109µs     8625.        0B     2.01  4284     1      497ms
>> 
>> What is going on here? Both f_closure and f_env have non-global 
>> environments. Why is one JIT-compiled, but not the other? Is there a way to 
>> ensure that functions defined in environments will be JIT-compiled?
>
> About what is going on in f_closure:  I think the anonymous factory
>
>    function() {
>      function(x) {
>        for(i in 1:10000) x <- x + 1
>        x
>      }
>    }
>
> got byte compiled before first use, and that compiled its result.  That seems 
> to be what this code indicates:
>
>  f_closure <- (function() {
>  res <- function(x) {
>  for(i in 1:10000) x <- x + 1
>  x
>  }; print(res); res
>  })()
>  #> function(x) {
>  #> for(i in 1:10000) x <- x + 1
>  #> x
>  #> }
>  #> <bytecode: 0x7fb43ec3aa70>
>  #> <environment: 0x7fb441117ac0>

That is right.
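
The same effect can be reproduced without relying on the JIT. The following is a minimal sketch, assuming the factory is compiled explicitly with compiler::cmpfun; make_f and is_compiled are hypothetical names used only for illustration. Closures created by a byte-compiled factory come out with byte-compiled bodies:

```r
library(compiler)

# Hypothetical helper: a byte-compiled closure prints with a "<bytecode: ...>" line.
is_compiled <- function(f) {
  any(grepl("^<bytecode: ", capture.output(print(f))))
}

# Compile the factory explicitly instead of waiting for the JIT to do it.
make_f <- cmpfun(function() {
  function(x) {
    for (i in 1:10000) x <- x + 1
    x
  }
})

# The returned closure lives in a fresh, non-global environment.
f_closure <- make_f()

is_compiled(make_f)     # the factory itself
is_compiled(f_closure)  # the closure it returned
f_closure(0)
```

The helper only inspects printed output, so it is a convenience for interactive checks rather than a robust test.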

> But even if that's true, it doesn't address the bigger question of why 
> f_global and f_env are treated differently.

There are various heuristics in the JIT code to avoid spending too
much time in the JIT. The current details are in the source
code. Mostly this is to deal with usually ill-advised coding practices
that programmatically build many small functions.  Hopefully these
heuristics can be reduced or eliminated over time.

For now, putting the code in a package, where the default is to byte
compile on source install, or explicitly calling compiler::cmpfun are
options.
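
As a rough sketch of the second option, applied to the f_env example from the original post (is_compiled and f_env_cmp are hypothetical names introduced here for illustration):

```r
library(compiler)

# Hypothetical helper: a byte-compiled closure prints with a "<bytecode: ...>" line.
is_compiled <- function(f) {
  any(grepl("^<bytecode: ", capture.output(print(f))))
}

# Closure defined in a local (non-global) environment, as in the original post.
f_env <- local({
  function(x) {
    for (i in 1:10000) x <- x + 1
    x
  }
})

# Compile it explicitly; cmpfun returns a new closure with a bytecode body
# and the same enclosing environment as the original.
f_env_cmp <- cmpfun(f_env)

is_compiled(f_env_cmp)
identical(environment(f_env_cmp), environment(f_env))
f_env_cmp(0)
```

Since cmpfun returns a new closure rather than modifying its argument, the result has to be assigned, for example back to the same name with f_env <- cmpfun(f_env).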

Best,

luke

>
> Duncan Murdoch
>
> ______________________________________________
> R-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu