[Rd] [External] GC: improving the marking performance for STRSXPs

iuke-tier@ey m@iii@g oii uiow@@edu iuke-tier@ey m@iii@g oii uiow@@edu
Tue Oct 19 01:45:23 CEST 2021


Thanks. I have committed a modified version, also incorporating the
handling of R_StringHash from your other post, in r81073. I prefer to
be more conservative in the GC. for example not assume without
checking that STRSXP elements are CHARSXP. This does add some
overhead, but the change is still beneficial.

I don't think we would want to add the complexity of threading at this
point, though it might be worth considering at a later time. There are
a few other possible modifications that I'll explore that might
provide comparable improvements to the ones seen with your patch
without adding the complexity of threads.

Best,

luke

On Thu, 7 Oct 2021, Andreas Kersting wrote:

> Hi all,
>
> in GC (in src/main/memory.c), FORWARD_CHILDREN() (called by PROCESS_NODES()) treats STRSXPs just like VECSXPs, i.e. it calls FORWARD_NODE() for all its children. I claim that this is unnecessarily inefficient since the children of a STRSXP can legitimately only be (atomic) CHARSXPs and could hence be marked directly in the call of FORWARD_CHILDREN() on the STRSXP.
>
> Attached patch (atomic_CHARSXP.diff) implements this and gives the following performance improvements on my system compared to R devel (revision 81008):
>
> Elapsed time for two full gc in a session after
>
> x <- as.character(runif(5e7))[]
>
> 19sec -> 15sec.
>
> This is the best-case scenario for the patch: very many unique/unmarked CHARSXP in the STRSXP. For already marked CHARSXP there is no performance gain since FORWARD_NODE() is a no-op for them.
>
> The relative performance gain is even bigger if iterating through the STRSXP produces many cache misses, as e.g. after
>
> x <- as.character(runif(5e7))[]
> x <- sample(x, length(x))
>
> Elapsed time for two full gc here: 83sec -> 52sec. This is because we have less cache misses per CHARSXP.
>
> This patch additionally also assumes that the ATTRIBs of a CHARSXP are not to be traced because they are just used for maintaining the CHARSXP hash chains.
>
> The second attached patch (atomic_CHARSXP_safe_unlikely.diff) checks both assumptions and calls gc_error() if they are violated and is still noticeably faster than R devel: 19sec -> 17sec and 83sec -> 54sec, respectively.
>
> Attached gc_test.R is the script I used to get the previously mentioned and more gc timings.
>
> Do you think that this is a reasonable change? It does make the code more complex and I am not sure if there might be situations in which the assumptions are violated, even though SET_STRING_ELT() and installAttrib() do enforce them.
>
> Best regards,
> Andreas

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney using uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


More information about the R-devel mailing list