[Rd] named arguments discouraged in `[.data.frame` and `[<-.data.frame`

Henrik Pärn henrik.parn using ntnu.no
Thu Nov 29 11:05:17 CET 2018


Thanks, Emil, for your thorough answer. It really clarified a lot for me, and revealed (even more of) my ignorance.
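
For anyone reading this later, here is a minimal sketch of the distinction as I now understand it, reusing Bill's data frame `d` from the quoted messages below. The variable names `direct` and `via_generic` are only for illustration, and the behaviour is what was reported in this thread; it may differ in other R versions.

```r
# Bill's data frame from the quoted message
d <- data.frame(C1 = c(r1 = 11, r2 = 21, r3 = 31), C2 = c(12, 22, 32))

# Calling the method directly: `[.data.frame` is an ordinary R function,
# so its arguments are matched by name even though 'x' is not first.
# The "named arguments ... are discouraged" warning is suppressed here.
direct <- suppressWarnings(`[.data.frame`(j = 1:2, d, i = 1))

# Going through the primitive generic: dispatch happens on the first
# argument (here 1:2), so the data.frame method is never reached.
# tryCatch() captures the error message instead of stopping the script.
via_generic <- tryCatch(
  `[`(j = 1:2, d, i = 1),
  error = function(e) conditionMessage(e)
)

print(direct)       # same rows and columns as d[1, 1:2]
print(via_generic)  # the error message from the generic call
```

So the named call is only "safe" when the method is called directly, which is hardly how anyone indexes a data frame in practice.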

>-----Original Message-----
>From: Emil Bode <emil.bode using dans.knaw.nl>
>Sent: Thursday, November 29, 2018 10:48 AM
>To: Henrik Pärn <henrik.parn using ntnu.no>; r-devel using r-project.org
>Subject: Re: [Rd] named arguments discouraged in `[.data.frame` and `[<-
>.data.frame`
>
>Well, the situation with `[.data.frame` (and `[<-`) is complicated by the fact that
>the data.frame method is not a primitive, but the generic IS.
>I'm not sure about dispatch for primitive generics, but I bet it's done on the
>first argument (as with S3). That means `[`(j=1:2,d,i=1) has nothing to do
>with `[.data.frame`: some internal code equivalent to something like
>`[.integer` is called instead (`[.integer` is not an R function, but I guess
>it's implemented in the C code for `[`).
>Note that `[.data.frame`(j=1:2,d,i=1) does work (it throws a warning, but
>returns the right result), because then you're calling the R function
>directly, and matching by name is done.
>
>But I think the main reason for the warning is forwards compatibility (and
>maybe backwards?). As of this version, `[.data.frame`(x = d, j = 2, i = 1) works
>fine, and `[.data.frame` is a regular R function. But it's used a lot, and I
>wouldn't be surprised if some future R version implemented it as a primitive.
>Without the warning, implementing `[.data.frame` as a primitive would break
>a LOT of older code. With the warning, we can make clear to users that
>calls like this one are undefined. They may work for now, but one
>shouldn't rely on them. That means only the "right" order may be used, and
>then naming the arguments is superfluous.
>
>By the way, when trying some things I noticed something else, which I'll send
>a separate mail about...
>
>Cheers,
>Emil
>
>On 29/11/2018, 09:20, "R-devel on behalf of Henrik Pärn" <r-devel-
>bounces using r-project.org on behalf of henrik.parn using ntnu.no> wrote:
>
>    Thanks Bill and Michael for taking the time to share your knowledge!
>
>    As further background to my question, here are two examples that I
>forgot to include in my original post (reminded by Michael's answer). I
>swapped the i and j arguments in `[.data.frame` and `[<-.data.frame`. This gave
>warnings, but otherwise no (?) problems. Using Bill's data:
>
>    `[.data.frame`(x = d, i = 1, j = 2)
>    # [1] 12
>
>    `[.data.frame`(x = d, j = 2, i = 1)
>    # [1] 12
>
>    And similar for `[<-.data.frame` :
>    `[<-.data.frame`(x = d, i = 1, j = 2, value = 1122)
>    `[<-.data.frame`(x = d, j = 2, i = 1, value = 12)
>
>    Because this seemed to work, I made the hasty conclusion that argument
>switching _wasn't_ a problem for `[.data.frame`, and that we could rely on
>exact matching on tags. But apparently not: even though `[.data.frame` and
>`[<-.data.frame` are _not_ primitive functions, positional matching is done
>there as well. Sometimes. At least when the 'x' argument is not first, as shown
>in Bill's examples. Obviously my "test" was insufficient...
>
>    Cheers,
>
>    Henrik
>
>
>
>    From: William Dunlap <wdunlap using tibco.com>
>    Sent: Wednesday, November 28, 2018 9:10 PM
>    To: Henrik Pärn <henrik.parn using ntnu.no>
>    Cc: r-devel using r-project.org
>    Subject: Re: [Rd] named arguments discouraged in `[.data.frame` and `[<-
>.data.frame`
>
>    They can get bitten in the last two lines of this example, where the 'x'
>argument is not first:
>    > d <- data.frame(C1=c(r1=11,r2=21,r3=31), C2=c(12,22,32))
>    > d[1,1:2]
>       C1 C2
>    r1 11 12
>    > `[`(d,j=1:2,i=1)
>       C1 C2
>    r1 11 12
>    Warning message:
>    In `[.data.frame`(d, j = 1:2, i = 1) :
>      named arguments other than 'drop' are discouraged
>    > `[`(j=1:2,d,i=1)
>    Error in (1:2)[d, i = 1] : incorrect number of dimensions
>    > do.call("[", list(j=1:2, i=1, x=d))
>    Error in 1:2[i = 1, x = list(C1 = c(11, 21, 31), C2 = c(12, 22, 32))] :
>      incorrect number of dimensions
>
>    Bill Dunlap
>    TIBCO Software
>    wdunlap http://tibco.com
>
>
>    On Wed, Nov 28, 2018 at 11:30 AM Henrik Pärn
><mailto:henrik.parn using ntnu.no> wrote:
>    tl;dr:
>
>    Why are named arguments discouraged in `[.data.frame`, `[<-.data.frame`
>and `[[.data.frame`?
>
>    (because this question is of the kind 'why is R designed like this?', I thought
>R-devel would be more appropriate than R-help)
>
>    #############################
>
>    Background:
>
>    Now and then students present their fancy functions like this:
>
>    myfancyfun(d,12,0.3,0.2,500,1000,FALSE,TRUE,FALSE,TRUE,FALSE)
>
>    Incomprehensible. Thus, I encourage them to use spaces and name
>arguments, _at least_ when trying to communicate their code with others.
>Something like:
>
>    myfancyfun(data = d, n = 12, gamma = 0.3, prob = 0.2,
>               size = 500, niter = 1000, model = FALSE,
>               scale = TRUE, drop = FALSE, plot = TRUE, save = FALSE)
>
>
>    Then some overzealous students started to use named arguments
>everywhere. E-v-e-r-y-w-h-e-r-e. Even in the most basic situation when
>indexing vectors (as a subtle protest?), like:
>
>    vec <- 1:9
>
>    vec[i = 4]
>    `[`(x = vec, i = 4)
>
>    vec[[i = 4]]
>    `[[`(x = vec, i = 4)
>
>    vec[i = 4] <- 10
>    `[<-`(x = vec, i = 4, value = 10)
>
>    ...or when indexing matrices:
>
>    m <- matrix(vec, ncol = 3)
>    m[i = 2, j = 2]
>    `[`(x = m, i = 2, j = 2)
>    # 5
>
>    m[i = 2, j = 2] <- 0
>    `[<-`(x = m, i = 2, j = 2, value = 0)
>
>    ######
>
>    This practice indeed feels like overkill, but it didn't seem to hurt either. Until
>they used it on data frames. Then suddenly warnings appeared that named
>arguments are discouraged:
>
>    d <- data.frame(m)
>
>    d[[i = "X2"]]
>    # [1] 4 5 6
>    # Warning message:
>    # In `[[.data.frame`(d, i = "X2") :
>    #  named arguments other than 'exact' are discouraged
>
>    d[i = 2, j = 2]
>    # [1] 0
>    # Warning message:
>    # In `[.data.frame`(d, i = 2, j = 2) :
>    #  named arguments other than 'drop' are discouraged
>
>    d[i = 2, j = 2] <- 5
>    # Warning message:
>    # In `[<-.data.frame`(`*tmp*`, i = 2, j = 2, value = 5) :
>    #  named arguments are discouraged
>
>
>    ##################################
>
>    Of course I could tell them "don't do it, it's overkill and not common
>practice" or "it's just a warning, don't worry". However, I assume the warnings
>are there for a good reason.
>
>    So how do I explain to the students that named arguments are actively
>discouraged in `[.data.frame` and `[<-.data.frame`, but not in `[` and `[<-`?
>When will they get bitten?
>
>    ______________________________________________
>    mailto:R-devel using r-project.org mailing list
>    https://stat.ethz.ch/mailman/listinfo/r-devel
>

