[R-pkg-devel] tibbles are not data frames

Stefan McKinnon Høj-Edwards sme at iysik.com
Tue Sep 26 11:23:51 CEST 2017


Thanks for the examples. Personally, I have been tripped up multiple times
by data frames dropping dimensions, so I have a distaste for this dropping
behaviour.

I prefer data frames *not* to drop dimensions. They are not
arrays, where dropping a dimension when slicing makes sense because all
entries have the same data type.
You can pull out a column in vector form from both tibbles and data frames
with the $ operator; subsetting a row from a data frame and forcing it into
an atomic vector requires casting all columns to the lowest common
denominator, often character.
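
To make the point above concrete, here is a small sketch. It uses base R
only, so that it runs without extra packages; the tibble lines are skipped
unless the tibble package happens to be installed. The object names (df,
col_a, row3 and so on) are made up for illustration.

```r
# Base R data frame; stringsAsFactors = FALSE keeps column b as character.
df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)

# Column extraction with $ or [[ yields a plain vector.
col_a <- df$a        # integer vector
col_b <- df[["b"]]   # character vector

# For a data frame, [row, column] drops to a vector by default ...
dropped <- df[3, "a"]
# ... unless drop = FALSE is given, which keeps a one-cell data frame.
kept <- df[3, "a", drop = FALSE]

# Flattening a row into an atomic vector coerces all columns to a common
# type, here character: the integer 3 becomes "3".
row3 <- unlist(df[3, ])

# Optional: the same column extraction on a tibble, if tibble is available.
if (requireNamespace("tibble", quietly = TRUE)) {
  tb <- tibble::as_tibble(df)
  tb_col  <- tb$a         # plain integer vector, same as df$a
  tb_cell <- tb[3, "a"]   # stays a tibble, no dropping
}
```

Code that sticks to $ or [[ for columns, and to drop = FALSE where a data
frame is wanted, should behave the same for both classes.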

So I would argue that yes, tibbles are data frames with extra bells and
whistles, even if I do not understand the use of list columns.

I suggest a defensive coding technique: if you need a data frame subset to
really be a vector, cast it to a vector. Users *will* attempt to throw
unexpected structures at your methods. When your methods fail in
mysterious ways because they didn't extract a vector, users will be
stupefied. A failure at `as.vector` will indicate why.
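
A minimal sketch of that defensive idea follows. The helper name
column_as_vector is hypothetical, and instead of relying on `as.vector`
itself to raise the error, the sketch extracts the column with [[ and checks
the result explicitly; that choice is my assumption, made so the failure
message is under the package author's control.

```r
# Hypothetical helper: return one column as an atomic vector, or fail loudly.
column_as_vector <- function(data, column) {
  if (!is.data.frame(data)) {
    stop("'data' must be a data frame or tibble", call. = FALSE)
  }
  # [[ returns the column itself for data frames and tibbles alike.
  x <- data[[column]]
  if (is.null(x)) {
    stop("column '", column, "' not found", call. = FALSE)
  }
  if (!is.atomic(x)) {
    stop("column '", column, "' is not an atomic vector (list column?)",
         call. = FALSE)
  }
  x
}

df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)
a_values <- column_as_vector(df, "a")   # the integer vector 1:5
```

A misspelled column name or a list column then stops with a message that
names the column, rather than failing somewhere further down.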

Kindly,
Stefan

Stefan McKinnon Høj-Edwards
ph.d. Genetics
+44 (0)776 231 2464
+45 2888 6598
Skype: stefan_edwards

2017-09-26 10:05 GMT+01:00 Joris Meys <Joris.Meys at ugent.be>:

> Here's one difference:
>
> atib <- tibble(a = 1:5, b = letters[5:1])
> atib[3,"a"]
> as.data.frame(atib)[3,"a"]
>
> The second line returns a tibble (no dropping of dimensions); the third
> line returns a vector (dimensions dropped). Huge difference if you use
> [ , aColumn] to select a vector from a data frame.
>
> Cheers
> Joris
>
> On Tue, Sep 26, 2017 at 10:57 AM, Stefan McKinnon Høj-Edwards <
> sme at iysik.com> wrote:
>
>> Hi Göran,
>>
>> Could you please elaborate on which kind of subsetting Hadley dislikes?
>> I have yet to encounter operations on data frames that are not possible
>> on tibbles.
>>
>> Kindly,
>> Stefan McKinnon Hoj-Edwards
>>
>> 2017-09-26 8:30 GMT+01:00 Göran Broström <goran.brostrom at umu.se>:
>>
>> > I am beginning to get complaints from users of my CRAN packages
>> > (especially 'eha') to the effect that they get error messages like
>> > "Error: Unsupported use of matrix or array for column indexing".
>> >
>> > It turns out that they are passing tibbles to functions that expect
>> > data frames as input. And I am using the kind of subsetting that Hadley
>> > dislikes (eha is an old package, much older than tibbles). It is of
>> > course a simple matter to change the code so it handles both data frames
>> > and tibbles correctly, but this affects many functions, and it will take
>> > some time. And when the next guy introduces 'troubles' as an improvement
>> > of 'tibbles', I will have to rewrite the code again.
>> >
>> > While I like Hadley's way of doing it, I think it is a mistake to let a
>> > tibble also be of class data frame. To me it is a matter of inheritance
>> > and backwards compatibility: A tibble should add nice things to a data
>> > frame, not change basic behaviour, in order to call itself a data frame.
>> >
>> > Is it correct to let a tibble be of class "data.frame"?
>> >
>> > Göran Broström
>> >
>> > ______________________________________________
>> > R-package-devel at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>>
>
> --
> Joris Meys
> Statistical consultant
>
> Ghent University
> Faculty of Bioscience Engineering
> Department of Mathematical Modelling, Statistics and Bio-Informatics
>
> tel : +32 9 264 59 87
> Joris.Meys at Ugent.be
> -------------------------------
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
