[R-pkg-devel] tibbles are not data frames

Holger Hoefling hhoeflin at gmail.com
Tue Sep 26 13:41:11 CEST 2017


Hi Thierry,

You write:

"If a package requires a data.frame, then it is up to the _user_ to
provide a data.frame (and a tibble is not a data.frame). "

Actually, as pointed out before, calling

is.data.frame

on a tibble returns TRUE. So I think that R says: yes, a tibble is a data
frame. What would be the point of having an "is.data.frame" function if you
can't trust its answer?
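
To make the point concrete, here is a minimal sketch in base R. It assumes
only that a tibble carries the class vector c("tbl_df", "tbl", "data.frame");
the object below imitates that class vector rather than using the tibble
package itself, and is_plain_data_frame() is a hypothetical helper name, not
an existing function.

```r
# A regular data frame.
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

# Assumption: imitate a tibble by giving the same object a tibble's
# class vector, without loading the tibble package.
tbl_like <- structure(df, class = c("tbl_df", "tbl", "data.frame"))

# is.data.frame() only asks whether "data.frame" is among the classes,
# so it cannot tell the two objects apart.
loose_check <- is.data.frame(tbl_like)

# Hypothetical stricter check: accept only objects whose class is
# exactly "data.frame", with no subclasses in front of it.
is_plain_data_frame <- function(x) identical(class(x), "data.frame")

strict_check_tbl <- is_plain_data_frame(tbl_like)
strict_check_df <- is_plain_data_frame(df)
```

Whether a package should use the loose or the strict check is exactly the
question under discussion; the sketch only shows that is.data.frame() alone
does not distinguish the two.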

And you can also look at it from the other side: Why does tibble need to
inherit from a data.frame? I don't know exactly what the original intention
behind this was, but I would guess that it was intended to make tibbles a
drop-in replacement for data.frames. And it looks like it is not succeeding
at this task.

Best

Holger Hoefling

On Tue, Sep 26, 2017 at 1:32 PM, Thierry Onkelinx <thierry.onkelinx at inbo.be>
wrote:

> Dear all,
>
> IMHO the problem is being looked at from the wrong perspective. The
> tibble doesn't change the data.frame, it uses all methods from
> data.frame which it doesn't implement itself. Hence it behaves like a
> data.frame to some extent.
>
> If a package requires a data.frame, then it is up to the _user_ to
> provide a data.frame (and a tibble is not a data.frame). Documenting
> this in the package documentation/FAQ or issuing a warning "don't use
> tibble" when the package is loaded should be sufficient.
>
> Best regards,
>
> ir. Thierry Onkelinx
> Statisticus/ Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> AND FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkelinx at inbo.be
> Kliniekstraat 25, B-1070 Brussel
> www.inbo.be
>
> 2017-09-26 13:18 GMT+02:00 Joris Meys <Joris.Meys at ugent.be>:
> >
> > On Tue, Sep 26, 2017 at 11:56 AM, Gábor Csárdi <csardi.gabor at gmail.com>
> > wrote:
> >
> > >
> > > I have yet to see an OOP system in which a subclass cannot override the
> > > methods of its superclass. Not only is this in line with OOP paradigms, it
> > > is actually one of the essential OOP features.
> > >
> >
> > Fair enough. And I shouldn't have used the word "inherit" in the first
> > place, we're talking S3 after all. Fwiw, overriding a method to do the
> > exact same except for one detail isn't encouraged in the OOP world either.
> >
> >
> > > To be more constructive, if you have a function that only works with
> > > data frame inputs, then it is good practice to check that the supplied
> > > input is indeed a data frame. This is independent of tibbles.
> > >
> >
> > Actually it's not independent of tibbles as illustrated by others.
> > is.data.frame() returns TRUE for tibbles. It doesn't for matrices or
> > vectors.
> >
> >
> > >
> > > In practice it seems to me that an easy fix is to just call
> > > as.data.frame on the input. This should either convert it to a data
> > > frame, or throw an error. For tibbles it drops the tbl* classes.
> > >
> >
> > This would also allow matrices or vectors to be converted to data.frames,
> > and that might or might not be warranted.
> >
> > I agree that the S3 system allows you to do this, and think it's up to the
> > package manager to decide whether or not they would allow their users to
> > use tibbles instead of data.frame objects.
> >
> > I think the bigger frustration is that tibble users are more prone to
> > expect all code to work exactly like it does with data.frames. Which it
> > obviously doesn't.
> >
> > --
> > Joris Meys
> > Statistical consultant
> >
> > Ghent University
> > Faculty of Bioscience Engineering
> > Department of Mathematical Modelling, Statistics and Bio-Informatics
> >
> > tel : +32 9 264 59 87
> > Joris.Meys at Ugent.be
> > -------------------------------
> > Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-package-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel