[Rd] Undocumented 'use.names' argument to c()

Martin Maechler maechler at stat.math.ethz.ch
Sun Sep 25 17:14:04 CEST 2016


>>>>> Suharto Anggono Suharto Anggono via R-devel <r-devel at r-project.org>
>>>>>     on Sun, 25 Sep 2016 14:12:10 +0000 writes:

    >> From comments in
    >> http://stackoverflow.com/questions/24815572/why-does-function-c-accept-an-undocumented-argument/24815653
    >> : The code of c() and unlist() was formerly shared but
    >> has been (long time passing) separated. From July 30,
    >> 1998, is where do_c got split into do_c and do_unlist.
    > With the implementation of 'c.Date' in R devel r71350, an
    > argument named 'use.names' is included for
    > concatenation. So, it doesn't follow the documented
    > 'c'. But, 'c.Date' is not explicitly documented in
    > Dates.Rd, that has 'c.Date' as an alias.

I do not see any  c.Date  in R-devel with a 'use.names'; it's a
base function, hence not hidden ..

As mentioned before, 'use.names' is used in unlist() in quite a
few places, and such an argument also exists for

    lengths()        	and
    all.equal.list()

and now c() 
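
A minimal sketch of how 'use.names = FALSE' behaves in these
functions, assuming an R version whose c() honours the argument. The
object 'x' and the result names below are made up for illustration and
do not come from the thread.

```r
# Hypothetical example data, not taken from the thread
x <- list(a = 1, b = 2:3)

# Default behaviour of c(): names are kept
r_named <- c(a = 1, b = 2)

# c(): 'use.names = FALSE' is consumed as a control argument,
# it does not become an element of the result
r_c <- c(a = 1, b = 2, use.names = FALSE)

# unlist(): the long-documented 'use.names' argument
r_unlist <- unlist(x, use.names = FALSE)

# lengths(): same argument name, same meaning
r_lengths <- lengths(x, use.names = FALSE)

print(names(r_named))
print(r_c)
print(r_unlist)
print(r_lengths)
```

In each case the values are unchanged and only the names attribute is
dropped, which is where the time and memory savings discussed below
would come from.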

    > --------------------------------------------
    > On Sat, 24/9/16, Martin Maechler
    > <maechler at stat.math.ethz.ch> wrote:

    >  Subject: Re: [Rd] Undocumented 'use.names' argument to c()
    >  To: "Karl Millar" <kmillar at google.com>
    >  Date: Saturday, 24 September, 2016, 9:12 PM
 
    >>>>>> Karl Millar via R-devel <r-devel at r-project.org>
    >>>>>>     on Fri, 23 Sep 2016 11:12:49 -0700 writes:

    >> I'd expect that a lot of the performance overhead could
    >> be eliminated by simply improving the underlying code.
    >> IMHO, we should ignore it in deciding the API that we
    >> want here.

    > I agree partially.  Even if the underlying code can be
    > made faster, the 'use.names = FALSE' version will still be
    > faster than the default, notably in some "long" cases.

    > More further down.

    >> On Fri, Sep 23, 2016 at 10:54 AM, Henrik Bengtsson
    >> <henrik.bengtsson at gmail.com> wrote:
    >>> I'd vote for it to stay.  It could of course surprise
    >>> someone who'd expect c(list(a=1), b=2, use.names =
    >>> FALSE) to generate list(a=1, b=2, use.names=FALSE).  On
    >>> the upside, there is the performance gain from using
    >>> use.names=FALSE.  The benchmarks below show that
    >>> combining the names attributes takes ~20-25 times
    >>> longer than combining the integers themselves.  Also,
    >>> to no surprise, use.names=FALSE avoids some memory
    >>> allocations.
    >>> 
    >>>> options(digits = 2)
    >>>>
    >>>> a <- b <- c <- d <- 1:1e4
    >>>> names(c) <- c
    >>>> names(d) <- d
    >>>>
    >>>> stats <- microbenchmark::microbenchmark(
    >>> +   c(a, b, use.names=FALSE),
    >>> +   c(c, d, use.names=FALSE),
    >>> +   c(a, d, use.names=FALSE),
    >>> +   c(a, b, use.names=TRUE),
    >>> +   c(a, d, use.names=TRUE),
    >>> +   c(c, d, use.names=TRUE),
    >>> +   unit = "ms"
    >>> + )
    >>>>
    >>>> stats
    >>> Unit: milliseconds
    >>>                        expr   min    lq  mean median    uq   max neval
    >>>  c(a, b, use.names = FALSE) 0.031 0.032 0.049  0.034 0.036 1.474   100
    >>>  c(c, d, use.names = FALSE) 0.031 0.031 0.035  0.034 0.035 0.064   100
    >>>  c(a, d, use.names = FALSE) 0.031 0.031 0.049  0.034 0.035 1.452   100
    >>>   c(a, b, use.names = TRUE) 0.031 0.031 0.055  0.034 0.036 2.094   100
    >>>   c(a, d, use.names = TRUE) 0.510 0.526 0.588  0.549 0.617 1.998   100
    >>>   c(c, d, use.names = TRUE) 0.780 0.815 0.886  0.841 0.944 1.430   100
    >>> 
    >>>> profmem::profmem(c(c, d, use.names=FALSE))
    >>> Rprofmem memory profiling of:
    >>> c(c, d, use.names = FALSE)
    >>>
    >>> Memory allocations:
    >>>        bytes      calls
    >>> 1      80040 <internal>
    >>> total  80040
    >>>
    >>>> profmem::profmem(c(c, d, use.names=TRUE))
    >>> Rprofmem memory profiling of:
    >>> c(c, d, use.names = TRUE)
    >>>
    >>> Memory allocations:
    >>>        bytes      calls
    >>> 1      80040 <internal>
    >>> 2     160040 <internal>
    >>> total 240080
    >>> 
    >>> /Henrik
    >>> 
    >>> On Fri, Sep 23, 2016 at 10:25 AM, William Dunlap via
    >>> R-devel <r-devel at r-project.org> wrote:
    >>>> In Splus c() and unlist() called the same C code, but
    >>>> with a different 'sys_index' code (the last argument to
    >>>> .Internal) and c() did not consider an argument named
    >>>> 'use.names' special.

    > Thank you, Bill, very much, for making the historical
    > context clear, and giving us the facts, there.

    > OTOH, it is also true in R, that c() and unlist() share
    > code .. quite a bit less though .. but more importantly,
    > the very original C code of Ross Ihaka (and possibly
    > Robert Gentleman) had explicitly considered both extra
    > arguments 'recursive' and 'use.names', and not just the
    > first.

    > The fact that c() has always been a .Primitive function
    > and that these have no formals() has contributed to what I
    > think was a documentation glitch early on, and when,
    > quite a bit later, we added a fake argument list for
    > printing, the then-current documentation was used.

    > This was the reason for declaring it a documentation
    > "hole" rather than something we do not want.

    > (read on)

    >>>>> c
    >>>> function(..., recursive = F)
    >>>> .Internal(c(..., recursive = recursive), "S_unlist", TRUE, 1)
    >>>>> unlist
    >>>> function(data, recursive = T, use.names = T)
    >>>> .Internal(unlist(data, recursive = recursive, use.names = use.names), "S_unlist", TRUE, 2)
    >>>>> c(A=1,B=2,use.names=FALSE)
    >>>>         A         B use.names
    >>>>         1         2         0
    >>>> 
    >>>> The C code used sys_index==2 to mean 'the last argument
    >>>> is the use.names argument'; if sys_index==1, only the
    >>>> recursive argument was considered special.
    >>>> 
    >>>> Sys.funs.c:
    >>>> 405 S_unlist(vector *ent, vector *arglist, s_evaluator *S_evaluator)
    >>>> 406 {
    >>>> 407     int which = sys_index; boolean named, recursive, names;
    >>>> ...
    >>>> 419     args = arglist->value.tree; n = arglist->length;
    >>>> ...
    >>>> 424     names = which==2 ? logical_value(args[--n], ent, S_evaluator) : (which == 1);
    >>>> 
    >>>> Thus there is no historical reason for giving c() the
    >>>> use.names argument.
    >>>> 
    >>>> 
    >>>> Bill Dunlap TIBCO Software wdunlap tibco.com
    >>>> 
    >>>> On Fri, Sep 23, 2016 at 9:37 AM, Suharto Anggono
    >>>> Suharto Anggono via R-devel <r-devel at r-project.org>
    >>>> wrote:
    >>>> 
    >>>>> In S-PLUS 3.4 help on 'c'
    >>>>> (http://www.uni-muenster.de/ZIV.BennoSueselbeck/s-html/helpfiles/c.html),
    >>>>> there is no 'use.names' argument.
    >>>>> 
    >>>>> Because 'c' is a generic function, I don't think that
    >>>>> changing formal arguments is good.
    >>>>> 
    >>>>> In R devel r71344, 'use.names' is not an argument of
    >>>>> functions 'c.Date', 'c.POSIXct' and 'c.difftime'.
    > You are right, Suharto, that methods for c() currently
    > have no such argument.

    > But again because c() is primitive and has a '...' at the
    > beginning, this does not explicitly hurt, currently, does
    > it?

    >>>>> Could 'use.names' be documented to be accepted by the
    >>>>> default method of 'c', but not listed as a formal
    >>>>> argument of 'c'?  Or, could the code that handles the
    >>>>> argument name 'use.names' be removed?

    > In principle, of course both could happen, and if one of
    > these two was preferable to the current state, I'd tend to
    > the first one: Consider 'use.names [= FALSE]' just an
    > argument of the default method for c(), so existing c()
    > methods would not have a strong need for updating.

    > Notably, as the S4 generic for c, via lines 48-49 of
    > src/library/methods/R/BasicFunsList.R

    > , "c" = structure(function(x, ..., recursive = FALSE)
    > standardGeneric("c"), signature="x")

    > has never had 'recursive' as part of the signature..  (and
    > yes, that line 48 does need an update too !!!).

    > Martin


    >>>>> ----------------
    >>>>> >>>>> David Winsemius <dwinsemius at comcast.net>
    >>>>> >>>>> on Tue, 20 Sep 2016 23:46:48 -0700 writes:
    >>>>> 
    >>>>> >> On Sep 20, 2016, at 7:18 PM, Karl Millar via R-devel
    >>>>> >> <r-devel at r-project.org> wrote:
    >>>>> >>
    >>>>> >> 'c' has an undocumented 'use.names' argument.  I'm not
    >>>>> >> sure if this is a documentation or implementation bug.
    >>>>> 
    >>>>> > It came up on stackoverflow a couple of years ago:
    >>>>> 
    >>>>> > http://stackoverflow.com/questions/24815572/why-does-function-c-accept-an-undocumented-argument/24815653#24815653
    >>>>> 
    >>>>> > At the time it appeared to me to be a documentation
    >>>>> lag.
    >>>>> 
    >>>>> Thank you, Karl and David, yes it is a documentation
    >>>>> glitch ... and a bit more: Experts know that
    >>>>> print()ing of primitive functions is, eehm, "special".
    >>>>> 
    >>>>> I've committed a change to R-devel ... (with the
    >>>>> intent to port to R-patched).
    >>>>> 
    >>>>> Martin
    >>>>> 
    >>>>> >>
    >>>>> >>> c(a = 1)
    >>>>> >> a
    >>>>> >> 1
    >>>>> >>> c(a = 1, use.names = F)
    >>>>> >> [1] 1
    >>>>> >>
    >>>>> >> Karl
    >>>>> 

    > ______________________________________________
    > R-devel at r-project.org mailing list
    > https://stat.ethz.ch/mailman/listinfo/r-devel


