[Rd] R_check_class_etc(x, valid) is "slow" when 'valid' contains class(x)

Tomas Kalibera tom@@@k@||ber@ @end|ng |rom gm@||@com
Thu Sep 8 15:23:40 CEST 2022


On 02/09/2022 14:51, Mikael Jagan wrote:
>
>
> On 2022-09-02 5:30 am, Tomas Kalibera wrote:
>> Hi Mikael,
>>
>> On 8/28/22 01:13, Mikael Jagan wrote:
>>> R_check_class_etc(x, valid) spends a nontrivial amount of time finding
>>> an environment 'rho' containing the definition of class(x), evaluating
>>> (in R, not C) methods::.classEnv(class(x)).
>>>
>>> It then returns the result of R_check_class_and_super(x, valid, rho).
>>> But R_check_class_and_super() does not use 'rho' at all in the trivial
>>> case where class(x) is found in 'valid'.
>> right, that could be improved. Do you have an example which exhibits 
>> the problem, have you found this by profiling something?
> I benchmarked .Call("R_etc", x) for x <- new("dgTMatrix"), after 
> compiling
> and loading the following:
>
> ```
> #include <Rinternals.h>
>
> SEXP R_etc(SEXP x) {
>     static const char *valid[] = {"dgCMatrix", "dgRMatrix", 
> "dgTMatrix", ""};
>     return ScalarInteger(R_check_class_etc(x, valid));
> }
> ```
>
> R-devel used ~2 microseconds while my patched version of R used ~0.2
> microseconds.  So, not a bottleneck by any means.  On the other hand,
> 'Matrix' and probably other packages do call R_check_class_etc() quite
> often, hence my report.
I see, ideally one would have a real scenario with a real performance 
issue leading to this. With a micro-benchmark, one could come up with 
pretty much arbitrary speedup, so it is harder to judge whether an 
optimization is actually worth the effort, the risk of introducing bugs 
and sometimes the added complexity.
>>>
>>> My feeling is that this can be improved.  I am happy to contribute a 
>>> patch,
>>> if it would be considered by R-core.
>>
>> Both R_check_class_etc and R_check_class_and_super are unfortunately 
>> exported, the former is used a lot in packages (even though they are 
>> not mentioned in Writing R Extensions, so actually shouldn't be used 
>> in packages). Anyway, it would be easier if we could preserve their 
>> interface and behavior.
>>
>> Maybe we could support rho==NULL in R_check_class_and_super, the 
>> environment would be looked up in that case when needed. 
>> R_check_class_etc would simply only call R_check_class_and_super with 
>> that argument. I see that R_check_class_and_super uses asChar() on 
>> the class attribute, while R_check_class_etc does not currently for 
>> looking for the environment, but I assume doing that in both cases 
>> should not matter (and it would have to be tested).
>>
>> So this would be a trivial change, but if you wanted to create a 
>> minimal patch, I will be happy to have a look.
>
> I've implemented essentially what you've described; see attached.

Thanks, I've restored the original behavior with pkg == R_NilValue (note 
rho was R_GlobalEnv), I've used C NULL to mark the special mode when 
"rho" should be found, and I made some small tweaks. I will add after 
more testing, if the results are ok.

Best
Tomas

>
> Mikael
>
>>
>> Best
>> Tomas
>>
>>
>>> Mikael
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list