[Rd] Error "promise already under evaluation ..." with function(x, dim=dim(x))

Mon Nov 24 01:42:35 CET 2014

On Sun, Nov 23, 2014 at 4:07 PM, Henrik Bengtsson <hb at biostat.ucsf.edu> wrote:
> On Sat, Nov 15, 2014 at 1:47 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
>>
>> On 14/11/2014, 9:06 PM, Henrik Bengtsson wrote:
>> > I've meant to ask the following for several years now.  I understand why:
>> >
>> >> foo <- function(x, dim=dim) { dim }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>> >
>> > gives an error, but why wouldn't/couldn't the following work?
>> >
>> >> foo <- function(x, dim=dim(x)) { dim }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>>
>> You refer to "dim".  There's a dim defined in the argument list, so R
>> uses that definition of it.
>>
>> But you didn't supply any value, so it tries to evaluate the default
>> value.  Default expressions are always evaluated in the evaluation frame
>> of the function call, so it looks for a function named "dim" in the
>> local frame.
>>
>> It finds the argument in the local frame, so it tries to figure out if
>> it is a function or a value.  It needs to evaluate it to do that, and
>> you get the recursion.
>>
>> >
>> > As a workaround I also tried:
>> >
>> >> foo <- function(x, dim) { if (missing(dim)) dim <- dim(x); dim }
>> >> foo(1)
>> > Error in foo(1) : argument "dim" is missing, with no default
>> >
>> > which surprised me too.
>> >
>> >
>> > For the first case, is the rationale related to:
>> >
>> >> foo <- function(x, a=dim(x), dim) { a }
>> >> foo(1)
>> > Error in foo(1) : argument "dim" is missing, with no default
>> >
>> > and
>> >
>> >> foo <- function(x, a=dim(x), dim=a) { a }
>> >> foo(1)
>> > Error in foo(1) :
>> >   promise already under evaluation: recursive default argument
>> > reference or earlier problems?
>> >
>> > [since here argument 'dim' could take a function, e.g. foo(1,
>> > dim=length)], and that R treats
>> >
>> > foo <- function(x, dim=dim(x)) { dim }
>> >
>> > in a similar way?  That is, is R not "clever" enough to detect this as
>> > a special case, but instead goes ahead and tries to evaluate the
>> > default expression (=dim(x)) of argument 'dim' in order to get its
>> > default value?  If so, is there anything preventing R from support
>> > this "special case", e.g. by evaluating the default expression without
>> > argument/symbol 'dim' itself being in the picture to avoid "it finds
>> > itself"?  (Sorry if I'm using the incorrect words here).
>>
>> No, it shouldn't do that.  It should use consistent rules for evaluation
>> or there would be sure to be bugs.
>>
>> >
>> > Yes, I understand that I can do:
>> >
>> >> foo <- function(x, dim=base::dim(x)) { dim }
>>
>> This is what you should do.
>>
>> >> foo(1)
>> > NULL
>> >
>> >> foo <- function(x, dim=NULL) { if (is.null(dim)) dim <- dim(x); dim }
>>
>> This works, because when R is looking up the function dim(), it can
>> evaluate the local argument dim and see it is not a function, so it
>> proceeds to the parent frame.
>>
>> >> foo(1)
>> > NULL
>> >
>> > or
>> >
>> >> foo <- function(x, dim.=dim(x)) { dim. }
>> >> foo(1)
>> > NULL
>>
>> This is another solution that works, but it has the ugly argument name
>> now, so you'll get warnings during package checks from calls like
>>
>> foo(1, dim=2)
>>
>> >
>> > but I would prefer not to have to turn those rather ad hoc solutions in my code.
>>
>> Nothing ad hoc about the first one.
>
> Thanks for the feedback.  I agree that base::dim(x) is clean and
> clear, but unfortunately there is a ~500 times overhead in using '::'.
> Since I went through the effort of doing the benchmarking and find
> faster solutions, I'm sharing the following:
>
>> library("microbenchmark")
>
>> x <- matrix(1:(80*80), nrow=80)
>
>> # Not "legal", because it calls .Primitive().
>> dim_illegal <- base::dim
>
>> dim_R <- function(x) {
> +   ns <- getNamespace("base")
> +   dim <- get("dim", envir=ns, inherits=FALSE, mode="function")
> +   dim(x)
> + }
>
>> dim_R_memoized <- local({
> +   dim <- NULL
> +   function(x) {
> +     if (is.null(dim)) {
> +       dim <<- get("dim", envir=getNamespace("base"), inherits=FALSE,
> mode="function")
> +     }
> +     dim(x)
> +   }
> + })
>
>> stats <- microbenchmark(
> +   dim(x),
> +   base::dim(x),
> +   dim_R(x),
> +   dim_R_memoized(x),
> +   dim_illegal(x),
> +   sum(x),
> +   unit="ns",
> +   times=10e3
> + )
> Warning message:
> In microbenchmark(dim(x), base::dim(x), dim_R(x), dim_R_memoized(x),  :
>   Could not measure a positive execution time for 3859 evaluations.
>
>> print(stats)
> Unit: nanoseconds
>               expr  min   lq       mean median    uq     max neval   cld
>             dim(x)    0    0    25.2226      1     1   10780 10000 a
>       base::dim(x) 6545 7700 10429.0165   8470 12897 2678155 10000     e
>           dim_R(x) 3080 3851  5163.8612   4236  6545   55435 10000   c
>  dim_R_memoized(x)  385  771  1238.8292   1156  1541   44656 10000  b
>     dim_illegal(x)    0    1    51.4421      1     1    5775 10000 a
>             sum(x) 8085 8470  9590.9570   8470 10395   49660 10000    d
>
> Yes, yes, the extra cost of using base::dim(x) is only ~10 us, but if
> you do, say, a million bootstrap samples calling this function, that's
> an extra unnecessary 10 seconds of processing time.  As a comparison,
> the overhead is roughly the same as summing 6400 integers.
>
> For workarounds, I considered:
>
> (a) dim_illegal
> (b) dim_R
> (c) dim_R_memoized
>
> where,
>
> (a) would be "good enough", but can immediately be discarded because
> if used in a package, it will create a copy of base::dim and thereby
> call .Primitive() immediately, which is unsafe.
>
> (b) is a poor-mans version of try to cut the corners of '::', but
> there is still a substantial overhead in each call, but still a 25-50%
> speedup compared to '::'.
>
> (c) is a smarter version of (b) that does the look up only ones, and
> managed to reduce the overhead to 10% of '::'.  It's still 50 times
> the overhead of a direct dim(x) call.
>
> Since one can byte compile packages (ByteCompile: TRUE in
> DESCRIPTION), I've also played around with compiler::cmpfun() and that
> prunes off about 10% of the non-compiled ditto.  I was
> somewhat/naively hoping that the compiler would be able to compile
> base::dim into a "constant", but that doesn't seem to be the case.
>
>
> BTW, is the following, which is ~2 times as fast as dim_R_memoized(),
> valid in an R package?   Will it set the local 'dim' variable when the
> package is loaded, which I assume is safe/legal, or before?  I didn't
> include it above, because I wasn't sure it was safe/valid.
>
> dim_R_memoized_2 <- local({
>   dim_local <- base::dim
>   function(x) dim_local(x)
> })

The above question still stands, but otherwise, I overlooked the most
obvious solution:

dim_1 <- function(x) dim(x)

which is favorable when benchmarked (~10 times slower than a direct
dim(x) call but otherwise the one of the fastest solutions):

Unit: nanoseconds
                  expr   min    lq      mean median    uq     max neval cld
                dim(x)     0     1    72.941      1     1    2696  1000 a
          base::dim(x) 11549 13474 15105.950  14245 15399   60824  1000   c
              dim_1(x)     1   771  2801.544    771  1156 1806225  1000 a
              dim_R(x)  5390  6930  8077.753   7315  8085  249069  1000  b
     dim_R_memoized(x)  1156  1926  2520.119   2310  2695   73528  1000 a
   dim_R_memoized_2(x)   385   771  1089.243    771  1156   20019  1000 a
        dim_illegal(x)     0     1   161.480      1   386    2311  1000 a
                sum(x) 10395 15784 16459.454  15785 16169  114333  1000   c

So, my best shot on the original problem would now be to either use:

dim2 <- function(x) dim(x)
foo <- function(x, dim=dim2(x)) { dim }

or simply avoid the name clash via:

foo <- function(x, dimx=dim(x)) { dimx }

/Henrik

>
>
> Thanks,
>
> /Henrik
>
>
>>
>> Duncan Murdoch
>>
>> >
>> >
>> > Thanks,
>> >
>> > Henrik
>> >
>> > ______________________________________________
>> > R-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>>