[R-pkg-devel] tibbles are not data frames

Pedro J. Aphalo pedro.aphalo at helsinki.fi
Tue Sep 26 15:51:24 CEST 2017


What I think is troublesome is that data.frame is part of the definition
of the R language, and the expectation based on R's normal behaviour is
that testing with is.data.frame() should be enough to ensure that an
object can be treated as a data frame. We can think of different
solutions for use in our packages, but the naive R user will always be
surprised by the behaviour of tibbles, because package 'tibble'
introduces an exception that breaks this expectation.
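
As a rough illustration of the kind of package-side solution meant here,
the sketch below combines the checks suggested in the quoted thread. It
uses base R only: 'fake_tbl' is a plain data frame that merely carries a
tibble-like class vector, and the helpers is_plain_df() and
with_plain_df() are hypothetical names made up for this example. With
package 'tibble' actually loaded, tb[, "a"] returns a one-column tibble
rather than a vector, which is the surprise the coercion guards against.
Restoring the class with class<- is an assumption that may not be safe
for every data.frame subclass.

```r
# Base R only. The class vector imitates the one a tibble carries, but no
# 'tibble' methods are loaded, so this object only *looks* like a tibble.
plain <- data.frame(a = 1:3, b = c("x", "y", "z"))
fake_tbl <- structure(plain, class = c("tbl_df", "tbl", "data.frame"))

# Weak check: TRUE for both objects, so it cannot tell them apart.
is.data.frame(plain)
is.data.frame(fake_tbl)

# Position check from the thread: "data.frame" is the first class only for
# a plain data frame (hypothetical helper name).
is_plain_df <- function(x) inherits(x, "data.frame", which = TRUE) == 1

# Strong "check": coerce, run the old code on a plain data frame, and put
# the caller's class back on data-frame results (hypothetical helper name).
with_plain_df <- function(x, f) {
  stopifnot(is.data.frame(x))
  original_class <- class(x)
  out <- f(as.data.frame(x))  # as.data.frame() drops the subclass entries
  if (is.data.frame(out)) class(out) <- original_class
  out
}

# Old code written for data frames can rely on dimension dropping again.
col_a <- with_plain_df(fake_tbl, function(d) d[, "a"])
subset_rows <- with_plain_df(fake_tbl, function(d) d[d$a > 1, ])
class(subset_rows)
```

Whether restoring the class is worth it depends on the package; simply
returning a plain data frame is the more conservative choice.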

I do not know what the best solution would be, though. Maybe we could
think of tibbles as a step towards R++ or R 4 or whatever future enhanced
version of R, in which they would replace data frames completely. Hadley
is correct in that they are a very significant improvement to R, but the
problem is the inconsistent behaviour.

Pedro.


On 2017-09-26 16:01, Göran Broström wrote:
> Thanks Gábor,
>
> that is OK. However, if I would like an input tibble to remain a tibble
> (after massaging) in the output, as a courtesy to the user, this will
> fail. I think that it works if I instead treat the input as a list:
> that's all 'the tibble way' does (in my case at least).
>
> Göran
>
> On 2017-09-26 14:17, Gábor Csárdi wrote:
>> Yes, basically tibbles violate the substitution principle. A lot of
>> other packages do, probably base R as well, although it is sometimes
>> hard to say, because there is no clear object hierarchy.
>>
>> Let's take a step back and see how you can check for a data frame argument.
>>
>> 1. Weak check.
>>
>> is.data.frame(arg)
>>
>> This essentially means that you trust subclasses of data.frame to
>> adhere to the substitution principle. While this is nice in theory, a
>> lot of packages (including both major packages implementing subclasses
>> of data.frame!) do not always adhere. So this is not really a safe
>> solution.
>>
>> Base R does this as well, sometimes, e.g. aggregate.data.frame has:
>>
>>      if (!is.data.frame(x))
>>          x <- as.data.frame(x)
>>
>> which is essentially equivalent to the weak check, since it leaves
>> data.frame subclasses untouched.
>>
>> 2. Strong "check".
>>
>> arg <- as.data.frame(arg)
>>
>> This is safer, because it does not rely on subclass implementors. It
>> also has the additional benefit that your code is polymorphic: it
>> works with any input, as long as it can be converted to a data frame.
>>
>> Base R also uses this often, e.g. in merge.data.frame:
>>
>>      nx <- nrow(x <- as.data.frame(x))
>>      ny <- nrow(y <- as.data.frame(y))
>>
>> Gabor
>>
>> Disclaimer: I do not represent the tibble authors in any way.
>>
>> On Tue, Sep 26, 2017 at 11:21 AM, David Hugh-Jones
>> <davidhughjones at gmail.com> wrote:
>>> These replies seem to be missing the point, which is that old code has
>>> to be rewritten because tibbles don't behave like data frames.
>>>
>>> It is true that subclasses can override behaviour, but there is an
>>> implicit contract that the same methods should do the same things.
>>>
>>> The as.xxx pattern seems weird to me, though I see it a lot. What is the
>>> point of inheritance if you always have to convert an object upwards
>>> before you can treat it as a member of the superclass?
>>>
>>> I can see this argument will run...
>>>
>>> David
>>>
>>> On 26 September 2017 at 11:15, Gábor Csárdi <csardi.gabor at gmail.com> 
>>> wrote:
>>>>
>>>> What is the benefit here, compared to just calling as.data.frame() on it?
>>>>
>>>> Gabor
>>>>
>>>> On Tue, Sep 26, 2017 at 11:11 AM, Daniel Lüdecke <d.luedecke at uke.de>
>>>> wrote:
>>>>> Since tibbles add their class attributes first, you could use:
>>>>>
>>>>> library(tibble)
>>>>> tb <- tibble(a = 5)
>>>>> inherits(tb, "data.frame", which = TRUE) == 1
>>>>>
>>>>> If "tb" is a plain data frame, TRUE is returned; for a tibble, FALSE.
>>>>> You could then coerce to a data frame: as.data.frame(tb)
>>>>>
>>>>> -----Original Message-----
>>>>> From: R-package-devel
>>>>> [mailto:r-package-devel-bounces at r-project.org] On
>>>>> Behalf Of Göran Broström
>>>>> Sent: Tuesday, 26 September 2017 12:09
>>>>> To: r-package-devel at r-project.org
>>>>> Subject: Re: [R-pkg-devel] tibbles are not data frames
>>>>>
>>>>>
>>>>>
>>>>> On 2017-09-26 11:56, Gábor Csárdi wrote:
>>>>>> On Tue, Sep 26, 2017 at 10:35 AM, Joris Meys <Joris.Meys at ugent.be>
>>>>>> wrote:
>>>>>>> I don't like the dropping of dimensions either. That doesn't change
>>>>>>> the fact that a tibble behaves differently from a data.frame. So
>>>>>>> tibbles do not inherit correctly from the class data.frame, and it
>>>>>>> can thus be argued that it's against OOP paradigms to pretend
>>>>>>> tibbles inherit from the class data.frame.
>>>>>>
>>>>>> I have yet to see an OOP system in which a subclass cannot override
>>>>>> the methods of its superclass. Not only is this in line with OOP
>>>>>> paradigms, it is actually one of the essential OOP features.
>>>>>>
>>>>>> To be more constructive, if you have a function that only works with
>>>>>> data frame inputs, then it is good practice to check that the
>>>>>> supplied input is indeed a data frame. This is independent of tibbles.
>>>>>
>>>>> It is not. I check input for being a data frame, but tibbles pass
>>>>> that test. That's the essence of the problem.
>>>>>
>>>>>> In practice it seems to me that an easy fix is to just call
>>>>>> as.data.frame on the input. This should either convert it to a data
>>>>>> frame, or throw an error.
>>>>>
>>>>> Sure, but I still need to rewrite the package.
>>>>>
>>>>> Göran
>>>>>
>>>>>> For tibbles it
>>>>>> drops the tbl* classes.
>>>>>>
>>>>>> Gabor
>>>>>>
>>>>>>> Defensive coding techniques would check if it's a tibble and return
>>>>>>> an error saying a data.frame is expected. Unless tibbles inherit
>>>>>>> correctly from data.frame.
>>>>>>>
>>>>>>> I have nothing against tibbles. But calling them "data.frame"
>>>>>>> raises expectations that can't be fulfilled.
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-package-devel at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

-- 
------------------------------------------------------------------------
Pedro J. Aphalo
University Lecturer, Principal Investigator
(Office 4417, Biocenter 3, Viikinkaari 1)

Department of Biosciences
Plant Biology
P.O. Box 65
00014 University of Helsinki
Finland

e-mail: pedro.aphalo at helsinki.fi <mailto:pedro.aphalo at helsinki.fi>
Tel. (mobile) +358 50 4150623
Tel. (office) +358 2941 57897

