[Rd] as.list method for by Objects

Suharto Anggono suharto_anggono at yahoo.com
Sat Feb 3 17:07:54 CET 2018


Maybe the behavior of 'as.list' in R is not inherited from S?
- From https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=78 , in "the prototype" (S), 'as.list' on a data frame gave a list, not a data frame as given by the default 'as.list' in R. That led to introduction of 'as.list.data.frame'.
- From https://biostat-lists.wustl.edu/sympa/arc/s-news/1999-07/msg00198.html , with
z <- c("a"=1, "b"=2) ,
as.list(z) doesn't have names in S-PLUS 3.4, unlike in R. In S-PLUS 5.1, as.list(z) has names, as in R.

In the "Details" section, the documentation (list.Rd) says this about 'as.list':
"Attributes may be dropped unless the argument already is a list or expression. (This is inconsistent with functions such as as.character which always drop attributes, and is for efficiency since lists can be expensive to copy.)"
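
As a minimal sketch of the behavior described there (the class and attribute names below are made up for illustration), the default method hands back a list unchanged, attributes included, while as.character strips them:

```r
# Hypothetical classed list with an extra attribute (names are illustrative only).
x <- structure(list(a = 1, b = 2), class = "myclass", note = "kept")
y <- as.list(x)       # no as.list.myclass method exists, so as.list.default is used
names(attributes(y))  # names, class and note are all still there
identical(x, y)       # the list comes back unchanged

# An atomic vector with names and an extra attribute.
z <- structure(1:2, names = c("a", "b"), note = "dropped")
attributes(as.character(z))  # NULL: as.character drops every attribute
```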

On the efficiency issue: shallow copying has since been introduced. So, can the behavior of the default method of 'as.list' be reconsidered?

Related: The default method of 'as.vector' with mode="list" behaves like the default method of 'as.list'. As a consequence, 'is.vector' with mode="list" on its result may return FALSE. I have raised the issue in https://stat.ethz.ch/pipermail/r-devel/2013-May/066671.html .
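
A small sketch of that consequence, assuming a made-up S3 class "myclass" with no 'as.vector' method of its own:

```r
# Hypothetical classed list; "myclass" is illustrative only.
x <- structure(list(a = 1, b = 2), class = "myclass")
v <- as.vector(x, mode = "list")  # already a list, so the class attribute survives
is.list(v)                        # TRUE
is.vector(v, mode = "list")       # FALSE: v has an attribute other than names
is.vector(unclass(v), mode = "list")  # TRUE once the class is removed
```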

------------------------
>>>>> Michael Lawrence <lawrence.michael at gene.com>
>>>>>     on Tue, 30 Jan 2018 15:57:42 -0800 writes:

    > I just meant that the minimal contract for as.list() appears to be that it
    > returns a VECSXP. To the user, we might say that is.list() will always
    > return TRUE.
    
Indeed. I also agree with Hervé that the user level
documentation should rather mention  is.list(.) |--> TRUE  than
VECSXP, and interestingly for the experts among us,
the  is.list() primitive gives TRUE not only for  VECSXP  but
also for LISTSXP (the good ole' pairlists).
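
For illustration only, a short check of the two internal types (the variable names are arbitrary):

```r
v  <- list(a = 1, b = 2)      # generic vector, VECSXP
pl <- pairlist(a = 1, b = 2)  # pairlist, LISTSXP
typeof(v)    # "list"
typeof(pl)   # "pairlist"
is.list(v)   # TRUE
is.list(pl)  # TRUE as well
```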

    > I'm not sure we can expect consistency across methods
    > beyond that, nor is it feasible at this point to match the
    > semantics of the methods package. It deals in "class
    > space" while as.list() deals in "typeof() space".

    > Michael

Yes, and that *is* the extra complexity we have in R (inherited
from S, I'd say)  which ideally wasn't there and of course is
not there in much younger languages/systems such as julia.
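
To make the "class space" versus "typeof() space" distinction concrete, here is a sketch using the two by objects quoted further down in this thread. The typeof() results noted in the comments are what the thread reports for R 3.4.x and R-devel at the time; other versions may differ.

```r
b1 <- by(1:2, 1:2, identity)
b2 <- by(warpbreaks[, 1:2], warpbreaks[, "tension"], summary)

# Class space: both objects carry the same S3 class.
class(b1)
class(b2)

# typeof() space: b1 is an integer array (per the thread), b2 is a list.
typeof(b1)
typeof(b2)
is.list(b1)
is.list(b2)
```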

And --- by the way let me preach, for the "class space" ---
do __never__ use

      if(class(obj) == "<classname>")

in your code (I see this so often, shockingly to me ...) but rather use

      if(inherits(obj, "<classname>"))

instead.
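
A small sketch of why the comparison is fragile, using a made-up object whose class attribute has two elements:

```r
# Hypothetical S3 object; "my_df" is illustrative only.
obj <- structure(list(), class = c("my_df", "data.frame"))

class(obj) == "data.frame"   # length-2 logical: FALSE TRUE
inherits(obj, "data.frame")  # a single TRUE
inherits(obj, "my_df")       # a single TRUE

# if (class(obj) == "data.frame") ... would therefore test a length-2
# condition, which is an error in R >= 4.2.0 and silently used only the
# first element (with a warning) in earlier versions.
```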

Martin



    > On Tue, Jan 30, 2018 at 3:47 PM, Hervé Pagès <hpages at fredhutch.org> wrote:

    >> On 01/30/2018 02:50 PM, Michael Lawrence wrote:
    >> 
    >>> by() does not always return a list. In Gabe's example, it returns an
    >>> integer, thus it is coerced to a list. as.list() means that it should be a
    >>> VECSXP, not necessarily with "list" in the class attribute.
    >>> 
    >> 
    >> The documentation is not particularly clear about what as.list()
    >> means for list derivatives. IMO clarifications should stick to
    >> simple concepts and formulations like "is.list(x) is TRUE" or
    >> "x is a list or a list derivative" rather than "x is a VECSXP".
    >> Coercion is useful beyond the use case of implementing a .C entry
    >> point and calling as.numeric/as.list/etc... on its arguments.
    >> 
    >> This is why I was hoping that we could maybe discuss the possibility
    >> of making the as.list() contract less vague than just "as.list()
    >> must return a list or a list derivative".
    >> 
    >> Again, I think that 2 things weigh quite a lot in that discussion:
    >> 1) as.list() returns an object of class "list" on a
    >> data.frame (strict coercion). If all that as.list() needed to
    >> do was to return a VECSXP, then as.list.default() already does
    >> this on a data.frame so why did someone bother adding an
    >> as.list.data.frame method that does strict coercion?
    >> 2) The S4 coercion system based on as() does strict coercion by
    >> default.
    >> 
    >> H.
    >> 
    >> 
    >>> Michael
    >>> 
    >>> 
    >>> On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès <hpages at fredhutch.org> wrote:
    >>> 
    >>> Hi Gabe,
    >>> 
    >>> Interestingly the behavior of as.list() on by objects seems to
    >>> depend on the object itself:
    >>> 
    >>> > b1 <- by(1:2, 1:2, identity)
    >>> > class(as.list(b1))
    >>> [1] "list"
    >>> 
    >>> > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
    >>> > class(as.list(b2))
    >>> [1] "by"
    >>> 
    >>> This is with R 3.4.3 and R devel (2017-12-11 r73889).
    >>> 
    >>> H.
    >>> 
    >>> On 01/30/2018 02:33 PM, Gabriel Becker wrote:
    >>> 
    >>> Dario,
    >>> 
    >>> What version of R are you using? In my mildly old 3.4.0
    >>> installation and in the version of R-devel I have lying around
    >>> (also mildly old...)  I don't see the behavior I think you are
    >>> describing
    >>> 
    >>> > b = by(1:2, 1:2, identity)
    >>> > class(as.list(b))
    >>> [1] "list"
    >>> > sessionInfo()
    >>> R Under development (unstable) (2017-12-19 r73926)
    >>> Platform: x86_64-apple-darwin15.6.0 (64-bit)
    >>> Running under: OS X El Capitan 10.11.6
    >>> 
    >>> Matrix products: default
    >>> BLAS: /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRblas.dylib
    >>> LAPACK: /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
    >>> 
    >>> locale:
    >>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
    >>> 
    >>> attached base packages:
    >>> [1] stats     graphics  grDevices utils     datasets  methods   base
    >>> 
    >>> loaded via a namespace (and not attached):
    >>> [1] compiler_3.5.0
    >>> 
    >>> >
    >>> 
    >>> 
    >>> As for by not having a class definition, no S3 class has an
    >>> explicit definition, so this is somewhat par for the course
    >>> here...
    >>> 
    >>> did I misunderstand something?
    >>> 
    >>> 
    >>> ~G
    >>> 
    >>> On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès
    >>> <hpages at fredhutch.org> wrote:
    >>> 
    >>> I agree that it makes sense to expect as.list() to perform
    >>> a "strict coercion" i.e. to return an object of class "list",
    >>> *even* on a list derivative. That's what as( , "list") does
    >>> by default:
    >>> 
    >>> # on a data.frame object
    >>> as(data.frame(), "list")  # object of class "list"
    >>>                           # (but strangely it drops the names)
    >>> 
    >>> # on a by object
    >>> x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
    >>> as(x, "list")  # object of class "list"
    >>> 
    >>> More generally speaking as() is expected to perform "strict
    >>> coercion" by default, unless called with 'strict=FALSE'.
    >>> 
    >>> That's also what as.list() does on a data.frame:
    >>> 
    >>> as.list(data.frame())  # object of class "list"
    >>> 
    >>> FWIW as.numeric() also performs "strict coercion" on an
    >>> integer vector:
    >>> 
    >>> as.numeric(1:3)  # object of class "numeric"
    >>> 
    >>> So an as.list.by method that does the same as as(x, "list")
    >>> would bring a small touch of consistency in an otherwise
    >>> quite inconsistent world of coercion methods(*).
    >>> 
    >>> H.
    >>> 
    >>> (*) as(data.frame(), "list", strict=FALSE) doesn't do what
    >>> you'd expect (just one of many examples)
    >>> 
    >>> 
    >>> On 01/29/2018 05:00 PM, Dario Strbenac wrote:
    >>> 
    >>> Good day,
    >>> 
    >>> I'd like to suggest the addition of an as.list method for a by
    >>> object that actually returns a list of class "list". This would
    >>> make it safer to do type-checking, because is.list also returns
    >>> TRUE for a data.frame variable and using class(result) == "list"
    >>> is an alternative that only returns TRUE for lists. It's also
    >>> confusing initially that
    >>> 
    >>> class(x)
    >>> 
    >>> [1] "by"
    >>> 
    >>> is.list(x)
    >>> 
    >>> [1] TRUE
    >>> 
    >>> since there's no explicit class definition for "by" and no
    >>> mention if it has any superclasses.
    >>> 
    >>> --------------------------------------
    >>> Dario Strbenac
    >>> University of Sydney
    >>> Camperdown NSW 2050
    >>> Australia
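
One way such a method could look, as a rough sketch only: the helper name by_to_plain_list is hypothetical and not part of R, and keeping names only for by objects built from a single grouping factor is an assumption of this sketch, not something proposed in the thread.

```r
# Hypothetical helper (not part of base R): strict coercion of a by object
# to a plain list that carries no attributes other than names.
by_to_plain_list <- function(x) {
    # Assumption: names are kept only when there is a single grouping factor.
    nms <- if (length(dim(x)) == 1L) dimnames(x)[[1L]] else NULL
    out <- as.vector(unclass(x), mode = "list")  # also handles non-list by objects
    attributes(out) <- NULL                      # drop dim, dimnames, call, ...
    names(out) <- nms
    out
}

b1 <- by(1:2, 1:2, identity)
b2 <- by(warpbreaks[, 1:2], warpbreaks[, "tension"], summary)
l1 <- by_to_plain_list(b1)
l2 <- by_to_plain_list(b2)
class(l1)
class(l2)
names(l2)
```

Registering the same body under the name as.list.by would make as.list() dispatch to it for by objects.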

    .............

    >>> --         Gabriel Becker, PhD
    >>> Scientist (Bioinformatics)
    >>> Genentech Research
    >>> 

    >> Hervé Pagès
    >> 
    >> Program in Computational Biology
    >> Division of Public Health Sciences
    >> Fred Hutchinson Cancer Research Center
    >> 1100 Fairview Ave. N, M1-B514
    >> P.O. Box 19024
    >> Seattle, WA 98109-1024
    >> 
    >> E-mail: hpages at fredhutch.org
    >> Phone:  (206) 667-5791
    >> Fax:    (206) 667-1319


