[Rd] Re: JIT compiler does not compile closures with custom environments

Duncan Murdoch
Wed Aug 18 16:56:29 CEST 2021


On 18/08/2021 10:45 a.m., luke-tierney at uiowa.edu wrote:
> On Wed, 18 Aug 2021, Duncan Murdoch wrote:
> 
>> On 18/08/2021 9:00 a.m., Taras Zakharko wrote:
>>> I have encountered a behavior of R’s JIT compiler that I can’t quite figure
>>> out. Consider the following code:
>>>
>>>
>>>      f_global <- function(x) {
>>>        for(i in 1:10000) x <- x + 1
>>>        x
>>>      }
>>>
>>>      f_env <- local({
>>>       function(x) {
>>>         for(i in 1:10000) x <- x + 1
>>>         x
>>>       }
>>>      })
>>>
>>>      compiler::enableJIT(3)
>>>
>>>     bench::mark(f_global(0), f_env(0))
>>>     # 1 f_global(0)    103µs 107.61µs     8770.    11.4KB      0    4384     0
>>>     # 2 f_env(0)       1.1ms   1.42ms      712.        0B     66.3   290    27
>>>
>>> Inspecting the closures shows that f_global has been byte-compiled while
>>> f_env has not been byte-compiled. Furthermore, if I assign a new
>>> environment to f_global (e.g. via environment(f_global) <- new.env()), it
>>> won’t be byte-compiled either.
>>>
>>> However, if I have a function returning a closure, that closure does get
>>> byte-compiled:
>>>
>>>     f_closure <- (function() {
>>>       function(x) {
>>>         for(i in 1:10000) x <- x + 1
>>>         x
>>>       }
>>>     })()
>>>
>>>     bench::mark(f_closure(0))
>>>     # 1 f_closure(0)    105µs    109µs     8625.        0B     2.01  4284     1      497ms
>>>
>>> What is going on here? Both f_closure and f_env have non-global
>>> environments. Why is one JIT-compiled, but not the other? Is there a way to
>>> ensure that functions defined in environments will be JIT-compiled?
>>
>> About what is going on in f_closure:  I think the anonymous factory
>>
>>     function() {
>>       function(x) {
>>         for(i in 1:10000) x <- x + 1
>>         x
>>       }
>>     }
>>
>> got byte compiled before first use, and that compiled its result.  That seems
>> to be what this code indicates:
>>
>>   f_closure <- (function() {
>>   res <- function(x) {
>>   for(i in 1:10000) x <- x + 1
>>   x
>>   }; print(res); res
>>   })()
>>   #> function(x) {
>>   #> for(i in 1:10000) x <- x + 1
>>   #> x
>>   #> }
>>   #> <bytecode: 0x7fb43ec3aa70>
>>   #> <environment: 0x7fb441117ac0>
> 
> That is right.
> 
>> But even if that's true, it doesn't address the bigger question of why
>> f_global and f_env are treated differently.
> 
> There are various heuristics in the JIT code to avoid spending too
> much time in the JIT. The current details are in the source
> code. Mostly this is to deal with usually ill-advised coding practices
> that programmatically build many small functions.  Hopefully these
> heuristics can be reduced or eliminated over time.
> 
> For now, putting the code in a package, where the default is to byte
> compile on source install, or explicitly calling compiler::cmpfun are
> options.
> 

Thanks!  Putting code in a package seems easiest, and is a good idea for 
lots of other reasons.
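
For code that has to stay outside a package, the explicit compiler::cmpfun
route Luke mentions could look roughly like the sketch below. The helper
is_compiled() and the name f_env_cmp are made up for illustration; the helper
only checks whether printing the closure shows a "<bytecode: ...>" line, which
is the same evidence used in the print(res) example earlier in the thread.

```r
library(compiler)

# Hypothetical helper (not part of R): TRUE if the printed form of the
# closure contains a "<bytecode: ...>" line.
is_compiled <- function(f) {
  any(grepl("^<bytecode", capture.output(print(f))))
}

# Same function as in the original report, defined inside local().
f_env <- local({
  function(x) {
    for (i in 1:10000) x <- x + 1
    x
  }
})

# Compile explicitly instead of relying on the JIT heuristics.
f_env_cmp <- cmpfun(f_env)

is_compiled(f_env_cmp)   # should report TRUE after cmpfun()
f_env_cmp(0)             # same result as f_env(0)
```

cmpfun() returns a new closure and leaves f_env itself unchanged, so the
result has to be assigned (possibly back to the same name) to take effect.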

Duncan Murdoch


