[Rd] debugging question

Liaw, Andy andy_liaw at merck.com
Thu Apr 3 11:52:21 MEST 2003


Dear R-devel,

A user reported a strange problem with predict.randomForest in the
randomForest package yesterday, and I'm baffled by it.  The code at the end
of the message produces the error.  The problem is that, in
predict.randomForest, there's a .Fortran call to "runforest".  One of the
arguments passed in is "countts", which is a vector of doubles.  The error
occurred because when the .Fortran call returned, that component of the
output is mysteriously turned into numeric(0)!  I checked that vector inside
the Fortran code (dimensioned as a matrix), and it looked fine.  Can anyone
provide some hint as to what the problem could be?
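
As a way to narrow this down from the R side, here is a minimal sketch
that compares the length of each argument before and after a foreign
call. It relies on the fact that .Fortran returns a list with the same
layout and names as the arguments passed in. The helper name
changed_lengths and the stand-in values below are hypothetical, not
part of randomForest; the real "runforest" call is not reproduced, and
only the name "countts" is taken from the description above.

```r
## Hypothetical helper (not part of randomForest): report which components
## of a foreign-call result differ in length from the arguments passed in.
changed_lengths <- function(args_in, args_out) {
  stopifnot(length(args_in) == length(args_out))
  nm <- names(args_in)
  if (is.null(nm)) nm <- as.character(seq_along(args_in))  # unnamed args
  len_in  <- vapply(args_in,  length, integer(1))
  len_out <- vapply(args_out, length, integer(1))
  data.frame(name = nm, before = len_in, after = len_out,
             row.names = NULL)[len_in != len_out, ]
}

## Stand-in for the argument list and for the list .Fortran() would return.
## The sizes are made up; "countts" mimics the symptom described above.
args_in  <- list(ntest = 3L, nclass = 2L, countts = double(2 * 3))
args_out <- args_in
args_out$countts <- numeric(0)

res <- changed_lengths(args_in, args_out)
print(res)
```

In real use, args_in would be the list of arguments built just before
the .Fortran call and args_out the value it returns, so a component that
changes length shows up by name.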

While I'm at it, can someone provide some tips on debugging Fortran code
with GDB?  The gdb manual has very little info on this topic.  For example,
how do I examine (print) arrays and values of arguments being passed in?

Any help much appreciated!

Regards,
Andy

Andy I. Liaw, PhD
Biometrics Research          Phone: (732) 594-0820
Merck & Co., Inc.              Fax: (732) 594-1565
P.O. Box 2000, RY84-16            Rahway, NJ 07065
mailto:andy_liaw at merck.com

==========================
library(randomForest)
library(mlbench)
data(Soybean)

Soybean <- Soybean[complete.cases(Soybean),]
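## Drop empty levels (assigning into Soybean[] keeps it a data frame;
## a bare lapply() would return a plain list and break nrow() below):
Soybean[] <- lapply(Soybean, function(x) factor(as.character(x)))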
nreps <- 100
rf.err <- numeric(nreps)

for (i in 1:nreps) {
  test <- sample(nrow(Soybean), 150, replace=FALSE)
  sb.rf <- randomForest(Class~., data=Soybean, subset=-test)
  sb.rf.pred <- predict(sb.rf, Soybean[test,])
  sb.rf.table <- table(sb.rf.pred, Soybean$Class[test])
  ## Store the test-set error rate (not the count of correct predictions):
  rf.err[i] <- 1 - sum(diag(sb.rf.table)) / length(test)
  print(rf.err[i])
}
