[Rd] Inefficiency in df$col
Duncan Murdoch
murdoch.duncan using gmail.com
Mon Feb 4 18:15:51 CET 2019
On 04/02/2019 11:34 a.m., Martin Maechler wrote:
>>>>>> peter dalgaard
>>>>>> on Mon, 4 Feb 2019 16:48:12 +0100 writes:
>
> > Does either of you have a patch against current R-devel?
> > I tried the obvious, but the build dies with
>
> > building package 'tools'
> > all.R is unchanged
> > ../../../../library/tools/libs/x86_64/tools.so is unchanged
> > installing 'sysdata.rda'
> > Error in get(method, envir = home) : object '$.data.frame' not found
> > Error: unable to load R code in package 'tools'
> > Execution halted
>
> > ...and I can't really be arsed to dig into tools to see exactly where it is hardcoding the existence of $.data.frame.
>
> > -pd
>
> Well, we two have been working "in parallel"...
>
> I've just sent an e-mail to R-core about this:
>
> It's really file.size() which does need a 'data.frame' method for `$`
> because it is basically a wrapper around file.info(..)$size and
> file.info(..) does return a data frame.
>
> I've been suggesting the byte compiler's optimizations here to be
> the problem ...
I think Radford's point is that there is no difference between the
behaviour of $ on a data.frame and on a list (except for the wording of the
warning message), so the $.list method (which is fast) is sufficient.
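
A minimal sketch of that equivalence, assuming a made-up data frame (the
names df, lst, value and label are illustrative and do not come from the
thread). It only compares results; it makes no claim about timings:

```r
# A small data frame and the plain named list underneath it.
df <- data.frame(value = 1:3, label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)
lst <- unclass(df)  # drops the "data.frame" class, leaving a list

# Exact match: both forms return the same column.
exact_same <- identical(df$value, lst$value)

# Partial match: "val" resolves to "value" in both cases.
# suppressWarnings() covers sessions where warnPartialMatchDollar is TRUE.
partial_same <- suppressWarnings(identical(df$val, lst$val))

# Missing name: both return NULL.
missing_same <- is.null(df$nope) && is.null(lst$nope)

stopifnot(exact_same, partial_same, missing_same)
```

With options(warnPartialMatchDollar = TRUE) both partial-match calls would
warn, differing only in the wording of the message.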
Duncan Murdoch
>
> >> On 4 Feb 2019, at 15:32 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
> >>
> >> On 04/02/2019 9:20 a.m., Radford Neal wrote:
> >>>>> I think you might want to just delete the definition of $.data.frame,
> >>>>> reverting to the situation before R-3.1.0.
> >>>>
> >>>> I imagine the cause is that the list version is done in C code rather
> >>>> than R code (i.e. there's no R function `$.list`). So an alternative
> >>>> solution would be to also implement `$.data.frame` in the underlying C
> >>>> code. This won't be quite as fast (it needs that test for NULL), but
> >>>> should be close in the full match case.
> >>> I maybe wasn't completely clear. The $ operator for data frames was
> >>> previously done in C - since it was done by the same primitive as for
> >>> lists. In R-3.1.0, this was changed - producing a massive slowdown -
> >>> for the purpose of giving a warning on partial matches even if the
> >>> user had not set the warnPartialMatchDollar option to TRUE. In
> >>> R-3.1.1, this was changed to not warn unless warnPartialMatchDollar was
> >>> TRUE which was the PREVIOUS behaviour. In other words, this change
> >>> reverted the change made in R-3.1.0. But instead of simply deleting
> >>> the definition of $.data.frame, R-3.1.1 added extra code to it, the
> >>> only effect of which is to slightly change the wording of the warning
> >>> message from what is produced for any other list, while still retaining
> >>> the massive slowdown.
> >>> There is no need for you to write $.data.frame in C. You just need
> >>> to delete the version written in R.
> >>
> >> Sorry, I did misunderstand. Thanks for the clarification.
> >>
> >> But if the "You" in your last sentence meant me, it needs to be "They": I am not a member of R Core and can't make any changes to the sources.
> >>
> >> Duncan Murdoch
> >>
> >> ______________________________________________
> >> R-devel using r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> > --
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Office: A 4.23
> > Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
>