[R-pkg-devel] Absent variables and tibble

Duncan Murdoch murdoch.duncan at gmail.com
Mon Jun 27 18:33:13 CEST 2016

On 27/06/2016 10:46 AM, Hadley Wickham wrote:
> On Mon, Jun 27, 2016 at 9:03 AM, Duncan Murdoch
> <murdoch.duncan at gmail.com> wrote:
> > On 27/06/2016 9:22 AM, Lenth, Russell V wrote:
> >>
> >> My package 'lsmeans' is now suddenly broken because of a new provision in
> >> the 'tibble' package (loaded by 'dplyr' 0.5.0), whereby the "[[" and "$"
> >> methods for 'tbl_df' objects - as documented - throw an error if a variable
> >> is not found.
> >>
> >> The problem is that my code uses tests like this:
> >>
> >>         if (is.null (x$var)) {...}
> >>
> >> to see whether 'x' has a variable 'var'. Obviously, I can work around this
> >> using
> >>
> >>         if (!("var" %in% names(x))) {...}
> >>
> >> but (a) I like the first version better, in terms of the code being
> >> understandable; and (b) isn't there a long history whereby we can expect a
> >> NULL result when accessing an absent member of a list (and hence a
> >> data.frame)? (c) the code base for 'lsmeans' has about 50 instances of such
> >> tests.
> >>
> >> Anyway, I wonder if a lot of other package developers test for absent
> >> variables in that first way; if so, they too are in for a rude awakening if
> >> their users provide a tbl_df instead of a data.frame. And what is considered
> >> the best practice for testing absence of a list member? Apparently, not
> >> either of the above; and because of (c), I want to do these many tedious
> >> corrections only once.
> >>
> >> Thanks for any light you can shed.
> >
> >
> > This is why CRAN asks that people test reverse dependencies.
>
> Which we did do - the problem is that this is actually caused by a
> recursive reverse dependency (lsmeans -> dplyr -> tibble), and we
> didn't correctly anticipate how much pain this would cause.

In fact, it's even harder than that, according to a message Russell sent 
me in private.  Neither package depends on the other; it happens when a 
user passes a 'tbl_df' object to Russell's package, and the tibble 
methods get called for it.  This is an unfortunate consequence of the S3 
system:  there's no place to define exactly what S3 methods are supposed 
to do, and no easy way for a package writer to test against all possible 
objects that might get passed in.
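
To make the dispatch issue concrete, here is a minimal sketch. The class
name "strict_df" and its `$` method are hypothetical and only imitate the
behaviour described in this thread; this is not tibble's actual code.

```r
# Hypothetical class "strict_df": a data.frame whose `$` errors on unknown
# names. An illustration only, not tibble's implementation.
`$.strict_df` <- function(x, name) {
  if (!name %in% names(x)) {
    stop("Unknown column '", name, "'", call. = FALSE)
  }
  # Index the unclassed list so we do not dispatch back to this method.
  unclass(x)[[name]]
}

plain  <- data.frame(a = 1:3)
strict <- structure(data.frame(a = 1:3), class = c("strict_df", "data.frame"))

# Package code written with plain data frames in mind:
has_no_var <- function(x) is.null(x$var)

plain_result <- has_no_var(plain)   # base `$` gives NULL for a missing column
strict_result <- try(has_no_var(strict), silent = TRUE)  # method errors instead
```

The function has_no_var() never mentions "strict_df", yet its behaviour
changes because `$` dispatches on the class of whatever object the user
passes in.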

I guess my advice would be not to trigger an error in a case like this, 
though you might want to lobby for the base "[[" and "$" methods to 
(optionally?) do so.
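
For package code that has to cope with such objects in the meantime, the
names()-based test from Russell's message can be wrapped in a small helper.
This is only a sketch; the name has_var() is made up for illustration and
does not come from any package.

```r
# Hypothetical helper: test for a member by name instead of relying on
# `$` or `[[` returning NULL for an absent one.
has_var <- function(x, name) {
  name %in% names(x)
}

df  <- data.frame(a = 1:3)
lst <- list(a = 1, b = NULL)   # 'b' is present, but its value is NULL

in_df_a   <- has_var(df, "a")     # column exists
in_df_var <- has_var(df, "var")   # column does not exist
in_lst_b  <- has_var(lst, "b")    # member exists even though it is NULL
null_b    <- is.null(lst$b)       # the is.null() test cannot tell the difference
```

Note that the two tests are not quite equivalent: for a list member that is
present but NULL, is.null(x$b) reports it as absent while has_var() reports
it as present.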

Duncan Murdoch
