[Rd] R_CheckUserInterrupt() can be a performance bottleneck within GUIs

Martin Maechler <maechler using stat.math.ethz.ch>
Wed Dec 18 13:15:37 CET 2024


>>>>> Simon Urbanek 
>>>>>     on Wed, 18 Dec 2024 13:19:04 +1300 writes:

    > It seems benign, but has implications since checking time
    > is actually not a cheap operation: adding just a time
    > check alone incurs a penalty of ca. 700% compared with the
    > time it takes to call R_CheckUserInterrupt().

Whoa!

    > Generally, it makes no sense to check interrupts at every
    > iteration - you'll find code like
    >     if (++i % 10000 == 0) R_CheckUserInterrupt();
    > in loops to make sure it's not called unnecessarily.

Thank you, Simon.

Tomas Kalibera proposed an even faster variant of such if() tests,
e.g., in src/main/scan.c, IIRC.
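
For readers following along, here is a minimal sketch of the counter-based
throttling pattern quoted above. It is not taken from R's sources:
check_user_interrupt() is a hypothetical stand-in for R_CheckUserInterrupt()
so that the example compiles without R's headers, and the power-of-two mask
variant is only one common way to make the test cheaper, not necessarily
what scan.c does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for R_CheckUserInterrupt(); it only counts calls
   so that the sketch compiles and runs without R's headers. */
static size_t n_checks = 0;
static void check_user_interrupt(void) { n_checks++; }

/* Variant 1: modulo test, mirroring `if (++i % 10000 == 0) ...`.
   The counter is incremented inside the test, as in the quoted idiom. */
double sum_with_modulo_check(size_t n, size_t every)
{
    assert(every > 0);
    double s = 0.0;
    for (size_t i = 0; i < n; ) {
        s += (double) i;                      /* the "real" work */
        if (++i % every == 0) check_user_interrupt();
    }
    return s;
}

/* Variant 2 (an assumption, not necessarily what R does): with a
   power-of-two interval the test reduces to a bit mask. */
double sum_with_mask_check(size_t n, size_t every_pow2)
{
    assert(every_pow2 > 0 && (every_pow2 & (every_pow2 - 1)) == 0);
    const size_t mask = every_pow2 - 1;
    double s = 0.0;
    for (size_t i = 0; i < n; ) {
        s += (double) i;
        if ((++i & mask) == 0) check_user_interrupt();
    }
    return s;
}
```

With n = 100000, the first variant (every = 10000) performs 10 checks and the
second (every_pow2 = 16384) performs 6, instead of 100000 in either case.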

I've patched (not yet committed) my version of wilcox.c and
compared in R-devel (inside ESS; i.e., not RStudio-crippled),
both without and with the patch:

## code on 1 line for easier cut'n'paste:
set.seed(101); twRdev <- replicate(20, {cat("."); W <- rwilcox(4e4,40,60); system.time(qwilcox(pwilcox(W,40,60), 40, 60))[1]})

summary(twRdev)
## _un_patched R Under development (unstable) (2024-12-17 r87446) -- "Unsuffered Consequences"
##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.9060  0.9185  0.9255  0.9524  0.9768  1.0910 

summary(twRdev)
## *PATCHED* R Under development (unstable) (2024-12-17 r87446) -- "Unsuffered Consequences"
##   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.5000  0.5058  0.5075  0.5230  0.5210  0.6380 


I plan to commit a version of this later / tomorrow.

Martin


    > Cheers,
    > Simon


    >> On Dec 18, 2024, at 4:04 AM, Ben Bolker <bbolker using gmail.com> wrote:
    >> 
    >> This seems like a great idea. Would it help to escalate this to a post on R-bugzilla, so it is less likely to fall through the cracks?
    >> 
    >> On 12/17/24 09:51, Jeroen Ooms wrote:
    >>> A more generic solution would be for R to throttle calls to
    >>> R_CheckUserInterrupt(), because it makes no sense to check 1000 times
    >>> per second if a user has interrupted, but it is difficult for the
    >>> caller to know when R_CheckUserInterrupt() has been last called, or do
    >>> it regularly without over-doing it.
    >>> Here is a simple patch: https://github.com/r-devel/r-svn/pull/125
    >>> See also: https://stat.ethz.ch/pipermail/r-devel/2023-May/082597.html
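
A minimal sketch of what such throttling could look like. It is not the
patch linked above: throttle_t, throttle_init(), throttle_tick() and
throttle_due() are hypothetical names, and the thresholds are arbitrary.
Because reading the clock has a cost of its own, the sketch first counts
calls and only consults the time on every N-th call; the current time is
passed in by the caller so that any clock source can be used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical throttle state; none of these names exist in R. */
typedef struct {
    unsigned long calls;       /* calls since the clock was last consulted */
    unsigned long clock_every; /* consult the clock only every N calls */
    double min_interval;       /* seconds required between real checks */
    double last_check;         /* time of the last real check, in seconds */
} throttle_t;

void throttle_init(throttle_t *t, unsigned long clock_every,
                   double min_interval, double now)
{
    assert(t != NULL && clock_every > 0 && min_interval >= 0.0);
    t->calls = 0;
    t->clock_every = clock_every;
    t->min_interval = min_interval;
    t->last_check = now;
}

/* Cheap path: returns 1 on every clock_every-th call, 0 otherwise.
   No clock access happens here. */
int throttle_tick(throttle_t *t)
{
    if (++t->calls < t->clock_every) return 0;
    t->calls = 0;
    return 1;
}

/* Returns 1 if at least min_interval seconds have passed since the last
   real check, and then records `now` as the time of the new check. */
int throttle_due(throttle_t *t, double now)
{
    if (now - t->last_check < t->min_interval) return 0;
    t->last_check = now;
    return 1;
}
```

In a hot loop this would be used as
`if (throttle_tick(&t) && throttle_due(&t, now)) R_CheckUserInterrupt();`,
where `now` comes from whatever monotonic clock is available.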
    >>> On Tue, Dec 17, 2024 at 10:47 AM Martin Becker
    >>> <martin.becker using mx.uni-saarland.de> wrote:
    >>>> 
    >>>> tl;dr: R_CheckUserInterrupt() can be a performance bottleneck
    >>>> within GUIs. This also affects functions in the 'stats'
    >>>> package, which could be improved by changing the position
    >>>> of calls to R_CheckUserInterrupt().
    >>>> 
    >>>> 
    >>>> Dear all,
    >>>> 
    >>>> Recently I was puzzled because some code in a package under development,
    >>>> which consisted almost entirely of a .Call() to a function written in C,
    >>>> was running much slower within RStudio compared to R in a terminal. It
    >>>> took me some time to identify the cause, so I thought I would share my
    >>>> findings; perhaps they will be helpful to others.
    >>>> 
    >>>> The performance drop was caused by R_CheckUserInterrupt(), which I call
    >>>> (perhaps too often) in my C code. While calling R_CheckUserInterrupt()
    >>>> seems to be quite cheap when running R or Rscript in a terminal, it is
    >>>> more expensive when running R within a GUI, especially within RStudio,
    >>>> as I noticed (but also, e.g., within R.app on macOS). In fact, using a
    >>>> GUI (especially RStudio) can change the cost of (frequent) calls to
    >>>> R_CheckUserInterrupt() from negligible to critical (in real-world
    >>>> applications). Significant performance drops are also visible for
    >>>> functions in the 'stats' package, e.g., pwilcox().
    >>>> 
    >>>> The following MWE (using Rcpp) illustrates the problem. Consider the
    >>>> following code:
    >>>> 
    >>>> ---
    >>>> 
    >>>> library(Rcpp)
    >>>> cppFunction('double nonsense(const int n, const int m, const int check) {
    >>>> int i, j;
    >>>> double result = 0.;
    >>>> for (i=0;i<n;i++) {
    >>>> if (check) R_CheckUserInterrupt();
    >>>> result = 1.;
    >>>> for (j=1;j<=m;j++) if (j%2) result *= j; else result /=j;
    >>>> }
    >>>> return(result);
    >>>> }')
    >>>> 
    >>>> tmp1 <- system.time(nonsense(1e8,10,0))[1]
    >>>> tmp2 <- system.time(nonsense(1e8,10,1))[1]
    >>>> cat("w/o check:",tmp1,"sec., with check:",tmp2,"sec.,",
    >>>>     "diff.:",tmp2-tmp1,"sec.\n")
    >>>> 
    >>>> tmp3 <- system.time(pwilcox(rwilcox(1e5,40,60),40,60))[1]
    >>>> cat("wilcox example:",tmp3,"sec.\n")
    >>>> 
    >>>> ---
    >>>> 
    >>>> Running this code when R (4.4.2) is started in a terminal window
    >>>> produces the following measurements/output (Apple M1, macOS 15.1.1):
    >>>> 
    >>>> w/o check: 0.525 sec., with check: 0.752 sec., diff.: 0.227 sec.
    >>>> wilcox example: 1.028 sec.
    >>>> 
    >>>> Running the same code when R is used within R.app (1.81 (8462)
    >>>> aarch64-apple-darwin20) on the same machine results in:
    >>>> 
    >>>> w/o check: 0.525 sec., with check: 1.683 sec., diff.: 1.158 sec.
    >>>> wilcox example: 2.13 sec.
    >>>> 
    >>>> Running the same code when R is used within RStudio Desktop (2024.12.0
    >>>> Build 467) on the same machine results in:
    >>>> 
    >>>> w/o check: 0.507 sec., with check: 22.905 sec., diff.: 22.398 sec.
    >>>> wilcox example: 29.686 sec.
    >>>> 
    >>>> So, the performance drop is already remarkable for R.app, but really
    >>>> huge for RStudio.
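
A quick back-of-the-envelope calculation puts these differences into
per-call terms. The sketch divides each reported "diff." by the 1e8 calls
made in nonsense(1e8, 10, 1); per_call_ns() and print_overheads() are
hypothetical helper names, and the inputs are the single-run timings quoted
above, so the results are rough estimates only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: average extra cost per call, in nanoseconds. */
double per_call_ns(double diff_sec, double n_calls)
{
    assert(n_calls > 0.0);
    return diff_sec / n_calls * 1e9;
}

/* Prints the per-call overhead for the three timings quoted above. */
void print_overheads(void)
{
    const double n = 1e8; /* iterations in nonsense(1e8, 10, 1) */
    printf("terminal: %.1f ns\n", per_call_ns(0.227, n));
    printf("R.app:    %.1f ns\n", per_call_ns(1.158, n));
    printf("RStudio:  %.1f ns\n", per_call_ns(22.398, n));
}
```

That is roughly 2 ns per call in the terminal, 12 ns in R.app and 224 ns in
RStudio.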
    >>>> 
    >>>> Presumably, checking for user interrupts within a GUI is more involved
    >>>> than within a terminal window, so there may not be much room for
    >>>> improvement in R.app or RStudio (and I know that this list is not the
    >>>> right place to suggest improvements for RStudio or to report unwanted
    >>>> behaviour). However, it might be worth considering
    >>>> 
    >>>> 1. an addition to the documentation in WRE (explaining that too many
    >>>> calls to R_CheckUserInterrupt() can cause a performance bottleneck,
    >>>> especially when the code is running within a GUI),
    >>>> 2. check (and possibly change) the position of R_CheckUserInterrupt() in
    >>>> some base R functions. For example, moving R_CheckUserInterrupt() from
    >>>> cwilcox() to pwilcox() and qwilcox() in src/nmath/wilcox.c may lead to a
    >>>> significant improvement (while still being feasible in terms of response
    >>>> time).
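
The effect of suggestion 2 can be illustrated with a toy sketch. It does not
reproduce src/nmath/wilcox.c: inner_term() and the two outer_sum_*()
functions are hypothetical, and check_user_interrupt() is a counting
stand-in for R_CheckUserInterrupt() so that the example compiles without
R's headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for R_CheckUserInterrupt(); it only counts calls. */
static size_t n_checks = 0;
static void check_user_interrupt(void) { n_checks++; }

/* Hypothetical inner helper, called once per term (loosely playing the
   role that cwilcox() has for pwilcox()/qwilcox()). */
static double inner_term(size_t k, int check_inside)
{
    if (check_inside) check_user_interrupt();
    return 1.0 / (double) (k + 1);
}

/* Before: the check sits in the helper, so one outer call triggers
   m interrupt checks. */
double outer_sum_check_inside(size_t m)
{
    assert(m > 0); /* toy precondition: at least one term */
    double s = 0.0;
    for (size_t k = 0; k < m; k++) s += inner_term(k, 1);
    return s;
}

/* After: the check is hoisted into the caller and runs once per
   outer call, while the numerical result is unchanged. */
double outer_sum_check_hoisted(size_t m)
{
    assert(m > 0);
    check_user_interrupt();
    double s = 0.0;
    for (size_t k = 0; k < m; k++) s += inner_term(k, 0);
    return s;
}
```

For m = 1000 the first version performs 1000 checks per outer call and the
second performs one; the trade-off is that a long-running outer call reacts
to an interrupt only at its next entry.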
    >>>> 
    >>>> Best,
    >>>> Martin
    >>>> 
    >>>> 
    >>>> --
    >>>> apl. Prof. Dr. Martin Becker, Akad. Oberrat
    >>>> Lehrstab Statistik
    >>>> Quantitative Methoden
    >>>> Fakultät für Empirische Humanwissenschaften und Wirtschaftswissenschaft
    >>>> Universität des Saarlandes
    >>>> Campus C3 1, Raum 2.17
    >>>> 66123 Saarbrücken
    >>>> Deutschland
    >>>> 
    >>>> ______________________________________________
    >>>> R-devel using r-project.org mailing list
    >>>> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >> -- 
    >> Dr. Benjamin Bolker
    >> Professor, Mathematics & Statistics and Biology, McMaster University
    >> Director, School of Computational Science and Engineering
    >> * E-mail is sent at my convenience; I don't expect replies outside of working hours.
    >> 
