[Rd] named arguments discouraged in `[.data.frame` and `[<-.data.frame`

Emil Bode
Thu Nov 29 10:48:21 CET 2018


Well, the situation with `[.data.frame` (and `[<-.data.frame`) is complicated by the fact that the data.frame method is not a primitive, but the generic is.
I'm not sure how dispatch works for primitive generics, but I bet it's done on the first argument (as with S3). That means `[`(j=1:2,d,i=1) has nothing to do with `[.data.frame`: some internal code equivalent to something like `[.integer` is called instead (`[.integer` is not an R function, but I guess it's implemented in the C code for `[`).
And note that `[.data.frame`(j=1:2,d,i=1) does work (it throws a warning, but returns the right result), because then you're simply calling the R function directly, and arguments are matched by name.
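
To make the dispatch point concrete, here is a minimal sketch using Bill's data frame. It assumes the behaviour described above (the primitive `[` dispatches on whatever comes first in the call, regardless of argument names); the helper `is_error` is a hypothetical name introduced only for this illustration.

```r
d <- data.frame(C1 = c(r1 = 11, r2 = 21, r3 = 31), C2 = c(12, 22, 32))

# Hypothetical helper: TRUE if evaluating `expr` signals an error.
is_error <- function(expr) inherits(try(expr, silent = TRUE), "try-error")

# Data frame first: the primitive dispatches to `[.data.frame`, which then
# matches i and j by name (and warns about the named arguments).
res_generic <- suppressWarnings(`[`(d, j = 1:2, i = 1))

# Integer vector first: the data frame method is never reached, so the
# call is treated as indexing 1:2 and fails.
fails <- is_error(`[`(j = 1:2, d, i = 1))

# Calling the method directly is an ordinary closure call, so the usual
# name-then-position argument matching applies and d ends up as x.
res_direct <- suppressWarnings(`[.data.frame`(j = 1:2, d, i = 1))
```

If the reasoning above holds, `res_generic` and `res_direct` should both equal `d[1, 1:2]`, and `fails` should be TRUE.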

But I think the main reason for the warning is forwards compatibility (and maybe backwards?). As of this version, `[.data.frame`(x = d, j = 2, i = 1) works fine, and `[.data.frame` is a regular R function. But it's used a lot, and I wouldn't be surprised if some future R version implemented it as a primitive.
Without the warning, implementing `[.data.frame` as a primitive would break a LOT of older code. With the warning, we can make clear to users that the behaviour of calls like this one is undefined: they may work for now, but one shouldn't rely on that. Which means only the "right" order should be used, and then naming the arguments is superfluous.
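
For anyone who wants to check which call forms currently trigger the warning, a small sketch follows. It assumes the warning behaves as quoted in this thread; the helper `warns` is a hypothetical name, not something from base R.

```r
d <- data.frame(C1 = c(r1 = 11, r2 = 21, r3 = 31), C2 = c(12, 22, 32))

# Hypothetical helper: TRUE if evaluating `expr` signals a warning.
warns <- function(expr) {
  tryCatch({ expr; FALSE }, warning = function(w) TRUE)
}

named_warns      <- warns(d[i = 1, j = 2])         # i and j given by name
positional_warns <- warns(d[1, 2])                 # conventional form
drop_warns       <- warns(d[1, 2, drop = FALSE])   # only 'drop' is named

# For now, the swapped named call still returns the same value as the
# positional one; the warning signals that this should not be relied on.
same_value <- identical(suppressWarnings(d[j = 2, i = 1]), d[1, 2])
```

Only the first form should warn, which matches the advice above: keep i and j positional and name nothing but 'drop'.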

By the way, when trying some things I noticed something else, which I'll send a separate mail about...

Cheers,
Emil 

On 29/11/2018, 09:20, "R-devel on behalf of Henrik Pärn" <r-devel-bounces using r-project.org on behalf of henrik.parn using ntnu.no> wrote:

    Thanks Bill and Michael for taking the time to share your knowledge! 
    
    As further background to my question, here are two examples that I forgot to include in my original post (Michael's answer reminded me). I swapped the i and j arguments in `[.data.frame` and `[<-.data.frame`. This gave warnings, but otherwise no (?) problems. Using Bill's data:
    
    `[.data.frame`(x = d, i = 1, j = 2)
    # [1] 12
    
    `[.data.frame`(x = d, j = 2, i = 1)
    # [1] 12
    
    And similar for `[<-.data.frame` :
    `[<-.data.frame`(x = d, i = 1, j = 2, value = 1122)
    `[<-.data.frame`(x = d, j = 2, i = 1, value = 12)
    
    Because this seemed to work, I hastily concluded that argument switching _wasn't_ a problem for `[.data.frame`, and that we could rely on exact matching on tags. But apparently not: even though `[.data.frame` and `[<-.data.frame` are _not_ primitive functions, positional matching is done there as well. Sometimes. At least when the 'x' argument is not first, as shown in Bill's examples. Obviously my "test" was insufficient...
    
    Cheers,
    
    Henrik
    
    
    
    From: William Dunlap <wdunlap using tibco.com> 
    Sent: Wednesday, November 28, 2018 9:10 PM
    To: Henrik Pärn <henrik.parn using ntnu.no>
    Cc: r-devel using r-project.org
    Subject: Re: [Rd] named arguments discouraged in `[.data.frame` and `[<-.data.frame`
    
    They can get bitten in the last two lines of this example, where the 'x' argument is not first:
    > d <- data.frame(C1=c(r1=11,r2=21,r3=31), C2=c(12,22,32))
    > d[1,1:2]
       C1 C2
    r1 11 12
    > `[`(d,j=1:2,i=1)
       C1 C2
    r1 11 12
    Warning message:
    In `[.data.frame`(d, j = 1:2, i = 1) :
      named arguments other than 'drop' are discouraged
    > `[`(j=1:2,d,i=1)
    Error in (1:2)[d, i = 1] : incorrect number of dimensions
    > do.call("[", list(j=1:2, i=1, x=d))
    Error in 1:2[i = 1, x = list(C1 = c(11, 21, 31), C2 = c(12, 22, 32))] :
      incorrect number of dimensions
    
    Bill Dunlap
    TIBCO Software
    wdunlap http://tibco.com
    
    
    On Wed, Nov 28, 2018 at 11:30 AM Henrik Pärn <mailto:henrik.parn using ntnu.no> wrote:
    tl;dr:
    
    Why are named arguments discouraged in `[.data.frame`, `[<-.data.frame` and `[[.data.frame`?
    
    (because this question is of the kind 'why is R designed like this?', I thought R-devel would be more appropriate than R-help)
    
    #############################
    
    Background:
    
    Now and then students present their fancy functions like this: 
    
    myfancyfun(d,12,0.3,0.2,500,1000,FALSE,TRUE,FALSE,TRUE,FALSE)
    
    Incomprehensible. Thus, I encourage them to use spaces and name arguments, _at least_ when trying to communicate their code with others. Something like:
    
    myfancyfun(data = d, n = 12, gamma = 0.3, prob = 0.2,
               size = 500, niter = 1000, model = FALSE,
               scale = TRUE, drop = FALSE, plot = TRUE, save = FALSE)
    
    
    Then some overzealous students started to use named arguments everywhere. E-v-e-r-y-w-h-e-r-e. Even in the most basic situation when indexing vectors (as a subtle protest?), like:
    
    vec <- 1:9
    
    vec[i = 4]
    `[`(x = vec, i = 4)
    
    vec[[i = 4]]
    `[[`(x = vec, i = 4)
    
    vec[i = 4] <- 10
    `[<-`(x = vec, i = 4, value = 10)
    
    ...or when indexing matrices:
    
    m <- matrix(vec, ncol = 3)
    m[i = 2, j = 2]
    `[`(x = m, i = 2, j = 2)
    # 5
    
    m[i = 2, j = 2] <- 0
    `[<-`(x = m, i = 2, j = 2, value = 0)
    
    ######
    
    This practice indeed feels like overkill, but it didn't seem to hurt either. Until they used it on data frames. Then suddenly warnings appeared that named arguments are discouraged:
    
    d <- data.frame(m)
    
    d[[i = "X2"]]
    # [1] 4 5 6
    # Warning message:
    # In `[[.data.frame`(d, i = "X2") :
    #  named arguments other than 'exact' are discouraged
    
    d[i = 2, j = 2]
    # [1] 0
    # Warning message:
    # In `[.data.frame`(d, i = 2, j = 2) :
    #  named arguments other than 'drop' are discouraged
    
    d[i = 2, j = 2] <- 5
    # Warning message:
    # In `[<-.data.frame`(`*tmp*`, i = 2, j = 2, value = 5) :
    #  named arguments are discouraged
    
    
    ##################################
    
    Of course I could tell them "don't do it, it's overkill and not common practice" or "it's just a warning, don't worry". However, I assume the warnings are there for a good reason.
    
    So how do I explain to the students that named arguments are actively discouraged in `[.data.frame` and `[<-.data.frame`, but not in `[` and `[<-`? When will they get bitten?
    
    ______________________________________________
    mailto:R-devel using r-project.org mailing list
    https://stat.ethz.ch/mailman/listinfo/r-devel
    

