[Rd] debugging strange segfault

Liaw, Andy andy_liaw at merck.com
Fri Jan 9 19:29:47 MET 2004


Thanks to DTL, BDR, KH and RG, I've found and fixed the bug.  The problem
was that I was offsetting an array being passed from C to Fortran, when I
shouldn't have.  Valgrind pinpointed the line that caused the trouble.  Now
that I've found the bug, I wonder why the code ever worked...

[I've just submitted the patched version to CRAN.]

Best,
Andy

> From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk] 
> 
> The symptom is that of the compiled code overrunning the storage it was
> allocated.  I tracked down an instance (in SuppDists) yesterday and have
> seen quite a few in my time.
> 
> I would expect you to find a write to memory off one or other end of an
> array, and compiling with bounds checking in place *may* help.  (g77 is
> less useful than some commercial Fortran compilers at this, and some
> people write code that cannot be checked.)
> 
> Increasing some storage areas for output arrays will often make the 
> problem go away, but it is not actually a solution since it may just 
> relocate an illegal write to a non-fatal place.
> 
> Hope that helps enough,
> 
> Brian
> 
> On Thu, 8 Jan 2004, Liaw, Andy wrote:
> 
> > Dear R-devel,
> > 
> > Can anyone give me some hints on how to go about debugging a strange
> > segfault in my randomForest package?  Here's the scoop:
> > 
> > A user reported a segfault when running predict() in the randomForest
> > package.  I asked for the data and code.  The combination runs fine under
> > WinXPPro, but does give a segfault on one of our Linux boxes running R
> > (1.7.0 through R-devel_2004-01-08) on Mandrake 9.0.
> > 
> > The predict.randomForest() function calls a C function "runforest" via
> > .C(..., DUP=FALSE, ...), which in turn calls a Fortran subroutine
> > "testreebag" within a for loop.  The segfault seems to occur right after
> > finishing the runforest() function in C and returning to R.  I inserted
> > the line:
> > 
> > Rprintf("Done!\n");
> > 
> > as the last line of the runforest() function and got the following output:
> > 
> > > library(randomForest, lib.loc="~/rlibs")
> > > arabid <- read.table('arabidopsis.out', sep=' ', header=T)
> > > arabid <- arabid[,-which(names(arabid) == "X0")]
> > > set.seed(1)
> > > fit <- randomForest(arabid[,-1], arabid[,1], ntree=100)
> > > predict(fit, arabid[,-1])
> > Done!
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x40152a48 in malloc () from /lib/libc.so.6
> > 
> > [If I change the DUP=FALSE in the .C() call to TRUE, I get the following:
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x080b412b in Rf_duplicate (s=0x1) at duplicate.c:75
> > 75          switch (TYPEOF(s)) {
> > ]
> > 
> > At this point I'm clueless as to what to do next, and would very much
> > appreciate any help!
> > 
> > Best,
> > Andy
> > 
> > Andy Liaw, PhD
> > Biometrics Research      PO Box 2000, RY33-300     
> > Merck Research Labs           Rahway, NJ 07065
> > mailto:andy_liaw at merck.com        732-594-0820
> > 
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595