[Rd] Unexpected argument-matching when some are missing

Emil Bode emil.bode using dans.knaw.nl
Thu Nov 29 17:47:21 CET 2018

Well, I did mean it as "missing".
To me, it felt just as natural as providing an empty index for subsetting (e.g. some.data.frame[,,drop=FALSE]).
I can't think of many uses other than subsetting, but I think this issue matters most when you're not entirely sure what a call is going to end up as: when passing along arguments, or when calling an unknown function (as in variants of the apply family, where you provide a function as an argument).
Or what happens if I use do.call(FUN, args=MyNamedList)? I have a somewhat more extensive example further down, where you can see the unexpected output more clearly.

But the problem is that R does NOT treat it as simply "missing". That would have been reasonable, but instead, as in the example in my previous mail,
myfun(x=, y=, "z's value") means x is assigned "z's value", and y and z are seen as missing, which is not at all what I was expecting.

This is also not consistent with other behaviour, as myfun(,,"z's value") and myfun(x=, y=, z="z's value") do work as expected (or at least as I was expecting).
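
To make the comparison easy to rerun on other R versions, here is a minimal sketch that puts the three call forms side by side. The helper name which_missing is made up for this illustration; it only reports which arguments R considers missing. I have left out the expected output for the last call on purpose, since that is the one in question.

```r
# Hypothetical helper (not from the earlier mail): returns a named logical
# vector saying which of x, y, z are missing in the call.
which_missing <- function(x, y, z) {
  c(x = missing(x), y = missing(y), z = missing(z))
}

# Purely positional empty arguments: x and y missing, z supplied
which_missing(, , "z's value")

# All arguments named, x and y left empty: x and y missing, z supplied
which_missing(x = , y = , z = "z's value")

# Mixed form from the previous mail: named empty x and y, unnamed value
which_missing(x = , y = , "z's value")
```

Running the last line on R-devel or on other platforms would show whether the behaviour I reported is specific to my setup.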

The extensive example:
Suppose I want to write a function that selects data from some external source. In order to do this, we put the data in its own environment, where we look for variables called "df", "rows", "cols" and "drop", and use these to make a selection. I write this function:

doselect <- function(env) {
  do.call(`[.data.frame`, list(env$df, if(!is.null(env$rows)) env$rows, if(!is.null(env$cols)) env$cols, drop=if(!is.null(env$drop)) env$drop))
}

It works for this code:
myenv <- new.env()
assign('df', data.frame(a=1:2, b=3:4), myenv, inherits=FALSE)
assign('rows', 1, myenv, inherits=FALSE) # Code breaks if we don't have this line
assign('cols', 1, myenv, inherits=FALSE) # Code breaks if we don't have this line
assign('drop', FALSE, myenv, inherits=FALSE)
doselect(myenv)

But if we don't assign "rows" and/or "cols", the variable "drop" is inserted in the place of the first unnamed variable, so the result is the same as if calling
[1] a b
<0 rows> (or 0-length row.names)

What I did expect was the same result as df[,,FALSE], i.e. the full data.frame. Of course I can rewrite the function "doselect", but I think my current call is how most people would write it (even though I admit the example in its entirety is far-fetched).
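
For completeness, a minimal sketch of one such rewrite, which avoids relying on how empty or absent arguments are matched. The name doselect_explicit and the chosen defaults are assumptions for this illustration: TRUE is used as an index meaning "keep everything", and drop falls back to FALSE when it is not set.

```r
# Hypothetical rewrite; the function name and the defaults are assumptions.
doselect_explicit <- function(env) {
  # TRUE as an index keeps every row or column, standing in for "missing"
  rows <- if (is.null(env$rows)) TRUE else env$rows
  cols <- if (is.null(env$cols)) TRUE else env$cols
  # Assumed default: keep the data.frame structure when drop is not set
  drop <- if (is.null(env$drop)) FALSE else env$drop
  env$df[rows, cols, drop = drop]
}

myenv <- new.env()
assign('df', data.frame(a=1:2, b=3:4), myenv, inherits=FALSE)
assign('drop', FALSE, myenv, inherits=FALSE)

# Neither "rows" nor "cols" is assigned, so the full data.frame comes back
res <- doselect_explicit(myenv)
dim(res)  # 2 rows, 2 columns
```

This sidesteps the matching question rather than answering it, so it does not change my point about the behaviour itself.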

Best regards, 
Emil Bode

On 29/11/2018, 14:58, "Ista Zahn" <istazahn using gmail.com> wrote:

    On Thu, Nov 29, 2018 at 5:09 AM Emil Bode <emil.bode using dans.knaw.nl> wrote:
    > When trying out some variations with `[.data.frame` I noticed some (to me) odd behaviour, which I found out has nothing to do with `[.data.frame`, but rather with the way arguments are matched, when mixing named/unnamed and missing/non-missing arguments. Consider the following example:
    > myfun <- function(x,y,z) {
    >   print(match.call())
    >   cat('x=',if(missing(x)) 'missing' else x, '\n')
    >   cat('y=',if(missing(y)) 'missing' else y, '\n')
    >   cat('z=',if(missing(z)) 'missing' else z, '\n')
    > }
    > myfun(x=, y=, "z's value")
    > gives:
    > # myfun(x = "z's value")
    > # x= z's value
    > # y= missing
    > # z= missing
    > This seems very counterintuitive to me, I expect the arguments x and y to be missing, and z to get “z’s value”.
    Interesting. I would expect it to throw an error, since "x=" is not
    syntactically complete. What does "x=" mean anyway? It looks like R
    interprets it as "x was not set to anything, i.e., is missing". That
    seems reasonable, though I think the example itself is pathological
    and would prefer that it produced an error.
    > When I call myfun(,y=,"z's value"), x is missing, and y gets “z’s value”.
    > Are my expectations wrong or is this a bug? And if my expectations are wrong, where can I find more information on argument-matching?
    > My gut-feeling says to call this a bug, but then I’m surprised no-one else has encountered it before.
    > And I don’t have multiple installations to work from, so could somebody else confirm this (if it’s not my expectations that are wrong) for R-devel/other R-versions/other platforms?
    > My setup: R 3.5.1, MacOS 10.13.6, both Rstudio 1.1.453 and R --vanilla from Bash
    > Best regards,
    > Emil Bode
    > ______________________________________________
    > R-devel using r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel
