[R-pkg-devel] tibbles are not data frames

Joris Meys Joris.Meys at ugent.be
Tue Sep 26 11:35:02 CEST 2017


I don't like the dropping of dimensions either. That doesn't change the
fact that a tibble behaves differently from a data.frame. So tibbles do not
inherit correctly from the class data.frame, and it can thus be argued that
it's against OOP paradigms to pretend that they do. A defensive coding
technique would be to check whether the input is a tibble and return an
error saying a data.frame is expected. Unless, that is, tibbles inherit
correctly from data.frame.
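
To make that concrete, here is a minimal sketch of such a check. The helper
name require_plain_df is made up for illustration (it is not part of eha or
tibble), and the only assumption is that tibbles carry the class "tbl_df" in
front of "data.frame":

```r
# Hypothetical helper, for illustration only: reject tibbles up front
# instead of failing later in some obscure subsetting call.
require_plain_df <- function(x, arg = "data") {
  if (inherits(x, "tbl_df")) {
    stop(sprintf("`%s` is a tibble, but a plain data.frame is expected; try as.data.frame().", arg),
         call. = FALSE)
  }
  if (!is.data.frame(x)) {
    stop(sprintf("`%s` must be a data.frame.", arg), call. = FALSE)
  }
  invisible(x)
}

df <- data.frame(a = 1:5, b = letters[5:1])
require_plain_df(df)  # passes silently

# Mimic the class vector of a tibble so the sketch runs without the
# tibble package installed (assumption: c("tbl_df", "tbl", "data.frame")).
fake_tib <- structure(df, class = c("tbl_df", "tbl", "data.frame"))
res <- tryCatch(require_plain_df(fake_tib),
                error = function(e) conditionMessage(e))
```

Whether to stop with an error or to quietly call as.data.frame() on the
input is a design choice; the error is the stricter of the two.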

I have nothing against tibbles. But calling them "data.frame" raises
expectations that can't be fulfilled.


On Tue, Sep 26, 2017 at 11:23 AM, Stefan McKinnon Høj-Edwards
<sme at iysik.com> wrote:

> Thanks for the examples. Personally, I have been tripped up multiple times
> by data frames dropping dimensions, so I have a distaste for this dropping
> behaviour.
>
> Personally, I prefer data frames *not* to drop dimensions. They are not
> arrays, where dropping a dimension when slicing makes sense because all
> entries have the same data type.
> You can pull out a column in vector form from both tibbles and data frames
> with the $ index; subsetting a row from a data frame and forcing it into an
> atomic vector requires casting all columns to the lowest common
> denominator, often character.
>
> So I would argue that yes, tibbles are data.frames with extra bells and
> whistles, even if I do not understand the use of list columns.
>
> I suggest a defensive coding technique: if you need a data frame subset to
> really be a vector, cast it as a vector. Users *will* attempt to throw
> unexpected structures at your methods. When your method fails in
> mysterious ways because it didn't extract a vector, users will be
> stupefied. A failure at `as.vector` will indicate why.
>
> Kindly,
> Stefan
>
> Stefan McKinnon Høj-Edwards
> ph.d. Genetics
> +44 (0)776 231 2464
> +45 2888 6598
> Skype: stefan_edwards
>
> 2017-09-26 10:05 GMT+01:00 Joris Meys <Joris.Meys at ugent.be>:
>
>> Here's one difference:
>>
>> atib <- tibble(a = 1:5, b = letters[5:1])
>> atib[3,"a"]
>> as.data.frame(atib)[3,"a"]
>>
>> The second line returns a tibble (dimensions are not dropped); the third
>> line returns a vector (dimensions are dropped). Huge difference if you use
>> [ , aColumn] to select a vector from a data frame.
>>
>> Cheers
>> Joris
>>
>> On Tue, Sep 26, 2017 at 10:57 AM, Stefan McKinnon Høj-Edwards <
>> sme at iysik.com> wrote:
>>
>>> Hi Göran,
>>>
>>> Could you please elaborate on which kind of subsetting Hadley
>>> dislikes?
>>> I have yet to encounter operations on data frames that are not possible
>>> on tibbles.
>>>
>>> Kindly,
>>> Stefan McKinnon Hoj-Edwards
>>>
>>> Stefan McKinnon Høj-Edwards
>>> ph.d. Genetics
>>> +44 (0)776 231 2464
>>> +45 2888 6598
>>> Skype: stefan_edwards
>>>
>>> 2017-09-26 8:30 GMT+01:00 Göran Broström <goran.brostrom at umu.se>:
>>>
>>> > I am beginning to get complaints from users of my CRAN packages
>>> > (especially 'eha') to the effect that they get error messages like
>>> > "Error: Unsupported use of matrix or array for column indexing".
>>> >
>>> > It turns out that they are passing tibbles to functions that expect
>>> > data frames as input. And I am using the kind of subsetting that Hadley
>>> > dislikes (eha is an old package, much older than tibbles). It is of
>>> > course a simple matter to change the code so it handles both data frames
>>> > and tibbles correctly, but this affects many functions, and it will take
>>> > some time. And when the next guy introduces 'troubles' as an improvement
>>> > of 'tibbles', I will have to rewrite the code again.
>>> >
>>> > While I like Hadley's way of doing it, I think it is a mistake to let a
>>> > tibble also be of class data frame. To me it is a matter of inheritance
>>> > and backwards compatibility: A tibble should add nice things to a data
>>> > frame, not change basic behaviour, in order to call itself a data frame.
>>> >
>>> > Is it correct to let a tibble be of class "data.frame"?
>>> >
>>> > Göran Broström
>>> >
>>> > ______________________________________________
>>> > R-package-devel at r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>>
>>>
>>
>>
>>
>> --
>> Joris Meys
>> Statistical consultant
>>
>> Ghent University
>> Faculty of Bioscience Engineering
>> Department of Mathematical Modelling, Statistics and Bio-Informatics
>>
>> tel : +32 9 264 59 87
>> Joris.Meys at Ugent.be
>> -------------------------------
>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>>
>
>


-- 
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Mathematical Modelling, Statistics and Bio-Informatics

tel : +32 9 264 59 87
Joris.Meys at Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php