[Rd] Inefficiency in df$col

Duncan Murdoch murdoch.duncan at gmail.com
Mon Feb 4 18:15:51 CET 2019


On 04/02/2019 11:34 a.m., Martin Maechler wrote:
>>>>>> peter dalgaard
>>>>>>      on Mon, 4 Feb 2019 16:48:12 +0100 writes:
> 
>      > Does either of you have a patch against current R-devel?
>      > I tried the obvious, but the build dies with
> 
>      > building package 'tools'
>      > all.R is unchanged
>      > ../../../../library/tools/libs/x86_64/tools.so is unchanged
>      > installing 'sysdata.rda'
>      > Error in get(method, envir = home) : object '$.data.frame' not found
>      > Error: unable to load R code in package 'tools'
>      > Execution halted
> 
>      > ...and I can't really be arsed to dig into tools to see exactly where it is hardcoding the existence of $.data.frame.
> 
>      > -pd
> 
> Well, we two have been working "in parallel"...
> 
> I've just sent an e-mail to R-core about this:
> 
> It's really file.size() that needs a 'data.frame' method for `$`,
> because it is basically a wrapper for file.info(..)$size, and
> file.info(..) returns a data frame.
> 
> I've suggested that the byte compiler's optimizations are
> the problem here ...

I think Radford's point is that `$` behaves the same on a data.frame as 
on a list (apart from the wording of the partial-match warning), so the 
built-in C code that handles `$` for lists (which is fast) is sufficient.
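
To make that concrete, here is a minimal sketch (my own illustration, not 
code from the R sources; the data frame `df` and its column names are made 
up). It compares `$` on a data frame with `$` on the same object after 
unclass(), for an exact match, a partial match and a missing name, and then 
does a rough timing of both:

```r
# Illustration only: compare `$` on a data frame with `$` on the same
# object stripped down to a plain list.
old <- options(warnPartialMatchDollar = FALSE)  # keep partial matches silent

df  <- data.frame(alpha = 1:3, beta = c("x", "y", "z"))
lst <- unclass(df)   # same columns, but no "data.frame" class

exact_same   <- identical(df$alpha, lst$alpha)            # exact name
partial_same <- identical(df$al, lst$al)                  # "al" partially matches "alpha"
absent_same  <- is.null(df$gamma) && is.null(lst$gamma)   # no match gives NULL

# Rough timing; the numbers depend on the R version and the machine.
n <- 1e5
t_df  <- system.time(for (i in seq_len(n)) df$alpha)[["elapsed"]]
t_lst <- system.time(for (i in seq_len(n)) lst$alpha)[["elapsed"]]

options(old)  # restore the user's setting
print(c(exact = exact_same, partial = partial_same, absent = absent_same))
print(c(data.frame = t_df, list = t_lst))
```

The three comparisons should all be TRUE. How far apart the two timings are 
will depend on whether the R build being used still has an R-level 
`$.data.frame` method.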

Duncan Murdoch

> 
>      >> On 4 Feb 2019, at 15:32 , Duncan Murdoch <murdoch.duncan using gmail.com> wrote:
>      >>
>      >> On 04/02/2019 9:20 a.m., Radford Neal wrote:
>      >>>>> I think you might want to just delete the definition of $.data.frame,
>      >>>>> reverting to the situation before R-3.1.0.
>      >>>>
>      >>>> I imagine the cause is that the list version is done in C code rather
>      >>>> than R code (i.e. there's no R function `$.list`).  So an alternative
>      >>>> solution would be to also implement `$.data.frame` in the underlying C
>      >>>> code.  This won't be quite as fast (it needs that test for NULL), but
>      >>>> should be close in the full match case.
>      >>> I maybe wasn't completely clear.  The $ operator for data frames was
>      >>> previously done in C - since it was done by the same primitive as for
>      >>> lists.  In R-3.1.0, this was changed - producing a massive slowdown -
>      >>> for the purpose of giving a warning on partial matches even if the
>      >>> user had not set the warnPartialMatchDollar option to TRUE.  In
>      >>> R-3.1.1, this was changed to not warn unless warnPartialMatchDollar was
>      >>> TRUE which was the PREVIOUS behaviour.  In other words, this change
>      >>> reverted the change made in R-3.1.0.  But instead of simply deleting
>      >>> the definition of $.data.frame, R-3.1.1 added extra code to it, the
>      >>> only effect of which is to slightly change the wording of the warning
>      >>> message from what is produced for any other list, while still retaining
>      >>> the massive slowdown.
>      >>> There is no need for you to write $.data.frame in C.  You just need
>      >>> to delete the version written in R.
>      >>
>      >> Sorry, I did misunderstand.  Thanks for the clarification.
>      >>
>      >> But if the "You" in your last sentence meant me, it needs to be "They": I am not a member of R Core and can't make any changes to the sources.
>      >>
>      >> Duncan Murdoch
>      >>
>      >> ______________________________________________
>      >> R-devel using r-project.org mailing list
>      >> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
>      > --
>      > Peter Dalgaard, Professor,
>      > Center for Statistics, Copenhagen Business School
>      > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>      > Phone: (+45)38153501
>      > Office: A 4.23
>      > Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com
> 
>
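
The file.size() dependency mentioned in the quoted message can be checked 
with a small sketch like this one (my own illustration; the temporary file 
`f` is hypothetical). It only shows that file.info() returns a data frame 
and that file.size() agrees with taking `$size` from it, which is why the 
build needs `$` to work on data frames at that point:

```r
# Illustration only: file.size() versus file.info()$size on a scratch file.
f <- tempfile()
writeLines("hello", f)

info        <- file.info(f)
is_df       <- is.data.frame(info)   # file.info() returns a data frame
via_wrapper <- file.size(f)          # the convenience wrapper
via_dollar  <- info$size             # `$` applied to the data frame

sizes_agree <- identical(via_wrapper, via_dollar)
unlink(f)                            # clean up the scratch file

print(c(is_data_frame = is_df, sizes_agree = sizes_agree))
```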


