[Bioc-devel] R_CheckUserInterrupt guidelines (was Re: R_CheckUserInterrupt in IRanges)

Douglas Bates bates at stat.wisc.edu
Tue Jan 12 16:34:27 CET 2010


This is a bit off-topic but one idiom that I see in this code is the
use of INTEGER() or REAL() inside potentially long loops.  This is not
a good idea.  In packages INTEGER(), REAL(), etc. are implemented as
functions, not macros.  It is best to assign a pointer outside the
loop and work with that, avoiding the function call overhead.  I
realize this could be a minuscule change but if what you do with that
pointer within the loop is trivial it is best not to add the function
call overhead.

On Tue, Jan 12, 2010 at 2:26 AM, Robert Castelo <robert.castelo at upf.edu> wrote:
> hi, let me take the chance that you're talking about this to check
> whether anybody finds anything odd in the following strategy. in the
> package i develop (qpgraph - devel version) for the computationally
> intensive C-loops i calculate the percentage of progress and each time
> this percentage changes in one integer unit i call
> R_CheckUserInterrupt() and i also take the chance to process
> window-system events with R_ProcessEvents() (if i understand correctly
> this is what this function does). here is the code snippet:
>
> ppct=-1;
> k=0;
>
> for (i = 0; i < l_int-1; i++) {
>  int i2 = INTEGER(pairup_ij_int)[i] - 1;
>
>  for (j = i+1; j < l_int; j++) {
>    int j2 = INTEGER(pairup_ij_int)[j] - 1;
>
>    REAL(nrrMatrix)[i2+j2*n_var] = qp_edge_nrr(REAL(S),n_var,
> INTEGER(N)[0], i2, j2, q, INTEGER(nTests)[0],REAL(alpha)[0]);
>
>    REAL(nrrMatrix)[j2+i2*n_var] = REAL(nrrMatrix)[i2+j2*n_var];
>    k++;
>    pct = (int) ((k * 100) / n_adj);
>    if (pct != ppct) {
>      if (INTEGER(verbose)[0]) { ## show progress when
>        if (pct % 10 == 0)       ## verbose=TRUE
>          Rprintf("%d",pct);
>        else
>          Rprintf(".",pct);
>        R_FlushConsole();
>      }
>      R_CheckUserInterrupt();
> #ifdef Win32
>      R_ProcessEvents();
> #endif
> #ifdef HAVE_AQUA
>      R_ProcessEvents();
> #endif
>      ppct = pct;
>    }
>  }
> }
>
> let me clarify that the double nested loop iterates more or less through
> every pair of variables and thus this computation grows quadratically in
> the number of input variables in the dataset.
>
> robert.
>
> On Mon, 2010-01-11 at 20:07 -0800, Henrik Bengtsson wrote:
>> To 2nd what Seth's says, I think it is worth showing an explicit
>> example of how to check at regular intervals, e.g.  for (i = 0; i < n;
>> i++) { if (i %% 100 == 0) R_CheckUserInterrupt(); ... }
>>
>> /Henrik
>>
>> On Mon, Jan 11, 2010 at 3:50 PM, Seth Falcon <sfalcon at fhcrc.org> wrote:
>> > On 1/11/10 1:31 PM, Patrick Aboyoun wrote:
>> >>
>> >> I just modified the C or Fortran code section of package guidelines to
>> >> reflect this new entry:
>> >>
>> >> http://wiki.fhcrc.org/bioc/Package_Guidelines#c-or-fortran-code
>> >
>> > This looks reasonable.
>> >
>> > It is worth noting, however, that calling R_CheckUserInterrupt is not
>> > entirely free.  If you have a tight loop it might make more sense to check
>> > at intervals rather than each time through and the choice of when to check
>> > will depend very much on the computation.
>> >
>> > Here's an example using the inline package (code is at the bottom) with a
>> > silly double loop to compare checking vs not:
>> >
>> > With R_CheckUserInterrupt inner loop:
>> >
>> >> system.time(z <- funx(10000L, TRUE))
>> >   user  system elapsed
>> >  1.310   0.001   1.313
>> >
>> > No checking:
>> >
>> > system.time(z <- funx(10000L, FALSE))
>> >   user  system elapsed
>> >   0.08    0.00    0.08
>> >
>> > Now this may be completely unrealistic, but I suspect that a sensible take
>> > away message is that we should be making sure our C code that loops does
>> > call R_CheckUserInterrupt, but depending on the computation, not at every
>> > loop iteration.
>> >
>> > + seth
>> >
>> >
>> > ## example code is below.
>> >
>> > library("inline")
>> >
>> > code <- "
>> > int i, j, sum = 0, int n = INTEGER(max)[0];
>> > int check_inter = LOGICAL(do_check)[0];
>> > SEXP ans;
>> > PROTECT(ans = allocVector(INTSXP, n));
>> > for (i = 0; i < n; i++) {
>> >    INTEGER(ans)[i] = i;
>> >    for (j = 0; j < n; j++) {
>> >        sum = i + j;
>> >        if (check_inter) {
>> >            R_CheckUserInterrupt();
>> >        }
>> >    }
>> > }
>> > UNPROTECT(1);
>> > return ans;
>> > "
>> >
>> > funx <- cfunction(signature(max="integer", do_check="logical"), code)
>> >
>> > system.time(z <- funx(10000L, TRUE))
>> > system.time(z <- funx(10000L, FALSE))
>> >
>> >
>> > --
>> > Seth Falcon
>> > Bioconductor Core Team | FHCRC
>> >
>> > _______________________________________________
>> > Bioc-devel at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >
>>
>> _______________________________________________
>> Bioc-devel at stat.math.ethz.ch mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>
> _______________________________________________
> Bioc-devel at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>



More information about the Bioc-devel mailing list