[R-pkg-devel] tibbles are not data frames

Göran Broström goran.brostrom at umu.se
Tue Sep 26 15:01:19 CEST 2017


Thanks Gábor,

that is OK. However, if I would like an input tibble to remain a tibble
(after massaging) in the output, as a courtesy to the user, this will fail,
since as.data.frame() drops the tibble classes.
I think that it works if I instead treat the input as a list: that is all
'the tibble way' does (in my case at least).

Göran
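
A minimal sketch of the "courtesy to the user" idea above, not a tested
recipe: do the work on a plain data frame and put the caller's class
vector back afterwards. The function name massage(), the columns x, y and
total, and the stand-in object tb (a data frame carrying a tibble's class
vector, so that the tibble package is not needed) are all assumptions
made for illustration. Restoring the class attribute alone may not be
enough for subclasses that keep extra attributes or invariants.

```r
# Hypothetical helper: work on a plain data frame, then hand back an
# object carrying the caller's original class vector.
massage <- function(dat) {
    if (!is.data.frame(dat))
        stop("'dat' must be a data frame")
    cls <- class(dat)              # remember the caller's class vector
    dat <- as.data.frame(dat)      # plain data.frame semantics from here on
    dat$total <- dat$x + dat$y     # stand-in for the real work
    if (!identical(cls, "data.frame"))
        class(dat) <- cls          # assumption: the class vector is all we need to restore
    dat
}

# Stand-in for a tibble (assumption): same class vector, no extra package.
tb <- structure(data.frame(x = 1:3, y = 4:6),
                class = c("tbl_df", "tbl", "data.frame"))

res    <- massage(tb)                             # keeps the tbl_df class
res_df <- massage(data.frame(x = 1:3, y = 4:6))   # stays a plain data frame
```

With the real tibble package one would probably convert back with its own
constructor rather than by assigning class(), but that is outside this
sketch.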

On 2017-09-26 14:17, Gábor Csárdi wrote:
> Yes, basically tibbles violate the substitution principle. A lot of
> other packages do, probably base R as well, although it is sometimes
> hard to say, because there is no clear object hierarchy.
> 
> Let's take a step back, and see how you can check for a data frame argument.
> 
> 1. Weak check.
> 
> is.data.frame(arg)
> 
> This essentially means that you trust subclasses of data.frame to
> adhere to the substitution principle. While this is nice in theory, a
> lot of packages (including both major packages implementing subclasses
> of data.frame!) do not always adhere to it. So this is not really a
> safe solution.
> 
> Base R does this as well, sometimes, e.g. aggregate.data.frame has:
> 
>      if (!is.data.frame(x))
>          x <- as.data.frame(x)
> 
> which is essentially equivalent to the weak check, since it leaves
> data.frame subclasses untouched.
> 
> 2. Strong "check".
> 
> arg <- as.data.frame(arg)
> 
> This is safer, because it does not rely on subclass implementors. It
> also has the additional benefit that your code is polymorphic: it
> works with any input, as long as it can be converted to a data frame.
> 
> Base R also uses this often, e.g. in merge.data.frame:
> 
>      nx <- nrow(x <- as.data.frame(x))
>      ny <- nrow(y <- as.data.frame(y))
> 
> Gabor
> 
> Disclaimer: I do not represent the tibble authors in any way.
> 
> On Tue, Sep 26, 2017 at 11:21 AM, David Hugh-Jones
> <davidhughjones at gmail.com> wrote:
>> These replies seem to be missing the point, which is that old code has to be
>> rewritten because tibbles don't behave like data frames.
>>
>> It is true that subclasses can override behaviour, but there is an implicit
>> contract that the same methods should do the same things.
>>
>> The as.xxx pattern seems weird to me, though I see it a lot. What is the
>> point of inheritance if you always have to convert an object upwards before
>> you can treat it as a member of the superclass?
>>
>> I can see this argument will run...
>>
>> David
>>
>> On 26 September 2017 at 11:15, Gábor Csárdi <csardi.gabor at gmail.com> wrote:
>>>
>>> What is the benefit here, compared to just calling as.data.frame() on it?
>>>
>>> Gabor
>>>
>>> On Tue, Sep 26, 2017 at 11:11 AM, Daniel Lüdecke <d.luedecke at uke.de>
>>> wrote:
>>>> Since tibbles add their class attributes first, you could use:
>>>>
>>>> library(tibble)
>>>> tb <- tibble(a = 5)
>>>> inherits(tb, "data.frame", which = TRUE) == 1
>>>>
>>>> If "tb" is a plain data frame (only), TRUE is returned; for a tibble,
>>>> FALSE. You could then coerce to a data frame: as.data.frame(tb)
>>>>
>>>> -----Original Message-----
>>>> From: R-package-devel [mailto:r-package-devel-bounces at r-project.org] On
>>>> Behalf Of Göran Broström
>>>> Sent: Tuesday, 26 September 2017 12:09
>>>> To: r-package-devel at r-project.org
>>>> Subject: Re: [R-pkg-devel] tibbles are not data frames
>>>>
>>>>
>>>>
>>>> On 2017-09-26 11:56, Gábor Csárdi wrote:
>>>>> On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys <Joris.Meys at ugent.be>
>>>>> wrote:
>>>>>> I don't like the dropping of dimensions either. That doesn't change
>>>>>> the fact that a tibble behaves differently from a data.frame. So
>>>>>> tibbles do not inherit correctly from the class data.frame, and it
>>>>>> can thus be argued that it's against OOP paradigms to pretend
>>>>>> tibbles inherit from the class data.frame.
>>>>>
>>>>> I have yet to see an OOP system in which a subclass cannot override
>>>>> the methods of its superclass. Not only is this in line with OOP
>>>>> paradigms, it is actually one of the essential OOP features.
>>>>>
>>>>> To be more constructive, if you have a function that only works with
>>>>> data frame inputs, then it is good practice to check that the supplied
>>>>> input is indeed a data frame. This is independent of tibbles.
>>>>
>>>> It is not. I check input for being a data frame, but tibbles pass that
>>>> test. That's the essence of the problem.
>>>>
>>>>> In practice it seems to me that an easy fix is to just call
>>>>> as.data.frame on the input. This should either convert it to a data
>>>>> frame, or throw an error.
>>>>
>>>> Sure, but I still need to rewrite the package.
>>>>
>>>> Göran
>>>>
>>>>> For tibbles it
>>>>> drops the tbl* classes.
>>>>>
>>>>> Gabor
>>>>>
>>>>>> Defensive coding techniques would check if it's a tibble and return
>>>>>> an error saying a data.frame is expected. Unless tibbles inherit
>>>>>> correctly from data.frame.
>>>>>>
>>>>>> I have nothing against tibbles. But calling them "data.frame" raises
>>>>>> expectations that can't be fulfilled.
>>>>>
>>>>> [...]
>>>>>
>>>>> ______________________________________________
>>>>> R-package-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>>>
>>
>>
> 
>
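
The three checks discussed in the quoted thread (the weak check, the
strong "check", and the class-position test) can be compared side by side
in a short sketch. To keep it runnable without the tibble package, tb here
is only a stand-in: a data frame given the class vector a tibble carries.
That stand-in and the variable names are assumptions for illustration; a
real tibble also overrides methods such as "[", which this sketch does not
reproduce.

```r
# Stand-in for a tibble (assumption): a data.frame subclass with the
# class vector c("tbl_df", "tbl", "data.frame").
tb <- structure(data.frame(a = 5),
                class = c("tbl_df", "tbl", "data.frame"))
df <- data.frame(a = 5)

# 1. Weak check: any subclass of data.frame passes.
weak_tb <- is.data.frame(tb)

# 2. Strong "check": coerce; classes in front of "data.frame" are dropped.
strong_tb <- as.data.frame(tb)

# 3. Position check: "data.frame" is the first class only for a plain
#    data frame; for the stand-in it sits in position 3.
plain_df <- inherits(df, "data.frame", which = TRUE) == 1
plain_tb <- inherits(tb, "data.frame", which = TRUE) == 1

class(strong_tb)
c(weak = weak_tb, plain_df = plain_df, plain_tb = plain_tb)
```

The weak check accepts the subclass as is, the position check only tells
the two apart, and the coercion is the only one of the three that hands
back an object with plain data.frame behaviour.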
