[R-pkg-devel] tibbles are not data frames

CJ Yetman cj at cjyetman.com
Tue Sep 26 13:45:49 CEST 2017


The problem is not the data.frame or the tibble... the problem arises when a
package unwittingly converts a data.frame/tibble to a vector, because of
bad defaults in data.frame methods, and then later on expects that object
to be a vector without explicitly making it one or checking that it is
one.
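
A minimal sketch of that point, using only base R and the same `foo` as in the
quoted example below (the names `implicit`, `as_df` and `as_vec` are just for
illustration, not from any package): the default `drop = TRUE` in
`[.data.frame` is what silently produces the vector, and stating the intent
explicitly avoids relying on it.

```r
# Same toy data as in the quoted example: one column, named rows
foo <- data.frame(x = 1:3, row.names = c("a", "b", "c"))

# Default drop = TRUE: a one-column result silently collapses to a vector
implicit <- foo["a", ]

# Explicit alternatives that state what is wanted
as_df  <- foo["a", , drop = FALSE]  # stays a data.frame with one row
as_vec <- foo[["x"]]                # extracts the column as a vector

is.data.frame(implicit)  # FALSE
is.data.frame(as_df)     # TRUE
is.vector(as_vec)        # TRUE
```

As far as I know, `drop = FALSE` and `[[` are also the forms that behave the
same way on a tibble, but that is not exercised here because the sketch avoids
depending on the tibble package.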

On Tue, Sep 26, 2017 at 1:10 PM, Alexandre Courtiol <
alexandre.courtiol at gmail.com> wrote:

> David is right,
>
> imagine some silly old code such as:
>
> get_a.data.frame <- function(d) if ("data.frame" %in% class(d)) d["a", ]
>
> This line of code giving you the row "a" of a data.frame could be in any
> package.
> No matter how ugly it is, it is technically correct and conforms to the
> original definition of data.frames.
>
> Now you have a data.frame:
>
> foo <- data.frame(x=1:3, row.names = c("a", "b", "c"))
>
> > get_a.data.frame(foo)
> [1] 1
>
> this is expected
>
> > get_a.data.frame(as.matrix(foo))
> [1]
>
> This returns nothing, again it is expected as a matrix is not a data.frame
>
> But here comes the tibble trouble:
>
> > get_a.data.frame(as.tibble(foo))
> # A tibble: 1 x 1
>       x
>   <int>
> 1    NA
>
> And now the old package is broken.
> Also, if we tolerate this, think what would happen if this kind of practice
> were to scale up!
> If anyone can assign any class they want without fulfilling the law
> of inheritance, we will soon be in big trouble, lost among the mutants.
>
> Tibbles are great and data.frames are widely used, but tibbles should not be
> of the class data.frame unless tibbles start behaving as data.frames do.
>
> Alex
>
> On 26 September 2017 at 12:21, Stefan McKinnon Høj-Edwards <sme at iysik.com>
> wrote:
>
> > There is no benefit. It is a rather cumbersome approach to checking
> > whether something behaves as you expect it to. `as.data.frame` will force
> > it into what you need; if it cannot be forced, then it will fail. That it
> > can be converted to a data.frame is the class designer's responsibility,
> > not yours. So you can use `as.data.frame` on *any* input that you need to
> > behave as a data.frame.
> > Consider a grouped tibble; now you have to test 2 different classes.
> >
> > Kindly,
> > Stefan
> >
> > Stefan McKinnon Høj-Edwards
> >
> > 2017-09-26 11:15 GMT+01:00 Gábor Csárdi <csardi.gabor at gmail.com>:
> >
> > > What is the benefit here, compared to just calling as.data.frame() on it?
> > >
> > > Gabor
> > >
> > > On Tue, Sep 26, 2017 at 11:11 AM, Daniel Lüdecke <d.luedecke at uke.de>
> > > wrote:
> > > > Since tibbles add their class attributes first, you could use:
> > > >
> > > > tb <- tibble(a = 5)
> > > > inherits(tb, "data.frame", which = TRUE) == 1
> > > >
> > > > If "tb" is a plain data frame, TRUE is returned; for a tibble, FALSE.
> > > > You could then coerce to a data frame: as.data.frame(tb)
> > > >
> > > > -----Original Message-----
> > > > From: R-package-devel [mailto:r-package-devel-bounces at r-project.org]
> > > > On Behalf Of Göran Broström
> > > > Sent: Tuesday, 26 September 2017 12:09
> > > > To: r-package-devel at r-project.org
> > > > Subject: Re: [R-pkg-devel] tibbles are not data frames
> > > >
> > > >
> > > >
> > > > On 2017-09-26 11:56, Gábor Csárdi wrote:
> > > > >> On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys <Joris.Meys at ugent.be> wrote:
> > > > >>> I don't like the dropping of dimensions either. That doesn't change
> > > > >>> the fact that a tibble behaves differently from a data.frame. So
> > > > >>> tibbles do not inherit correctly from the class data.frame, and it
> > > > >>> can thus be argued that it's against OOP paradigms to pretend
> > > > >>> tibbles inherit from the class data.frame.
> > > >>
> > > >> I have yet to see an OOP system in which a subclass cannot override
> > > >> the methods of its superclass. Not only is this in line with OOP
> > > >> paradigms, it is actually one of the essential OOP features.
> > > >>
> > > > >> To be more constructive, if you have a function that only works with
> > > > >> data frame inputs, then it is good practice to check that the
> > > > >> supplied input is indeed a data frame. This is independent of tibbles.
> > > >
> > > > > It is not. I check input for being a data frame, but tibbles pass
> > > > > that test. That's the essence of the problem.
> > > >
> > > >> In practice it seems to me that an easy fix is to just call
> > > >> as.data.frame on the input. This should either convert it to a data
> > > >> frame, or throw an error.
> > > >
> > > > Sure, but I still need to rewrite the package.
> > > >
> > > > > Göran
> > > >
> > > >> For tibbles it
> > > >> drops the tbl* classes.
> > > >>
> > > >> Gabor
> > > >>
> > > >>> Defensive coding techniques would check if it's a tibble and return
> > > >>> an error saying a data.frame is expected. Unless tibbles inherit
> > > >>> correctly from data.frame.
> > > >>>
> > > > >>> I have nothing against tibbles. But calling them "data.frame"
> > > > >>> raises expectations that can't be fulfilled.
> > > >>
> > > >> [...]
> > > >>
> > > >> ______________________________________________
> > > >> R-package-devel at r-project.org mailing list
> > > >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > > >>
> > > >
> > > > ______________________________________________
> > > > R-package-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/
> > > listinfo/r-package-devel
> > > >
> > > > --
> > > >
> > > > ____________________________________________________________
> _________
> > > >
> > > > Universitätsklinikum Hamburg-Eppendorf; Körperschaft des öffentlichen
> > > Rechts; Gerichtsstand: Hamburg | www.uke.de
> > > > Vorstandsmitglieder: Prof. Dr. Burkhard Göke (Vorsitzender), Prof.
> Dr.
> > > Dr. Uwe Koch-Gromus, Joachim Prölß, Martina Saurin (komm.)
> > > > ____________________________________________________________
> _________
> > > >
> > > > SAVE PAPER - THINK BEFORE PRINTING
> > > > ______________________________________________
> > > > R-package-devel at r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > >
> > > ______________________________________________
> > > R-package-devel at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > >
> >
> >         [[alternative HTML version deleted]]
> >
>
>
>
> --
> Alexandre Courtiol
>
> http://sites.google.com/site/alexandrecourtiol/home
>
> *"Science is the belief in the ignorance of experts"*, R. Feynman
