[R] R coredumps when calling functions from rms package

Johannes Huesing johannes at huesing.name
Mon Oct 6 08:42:44 CEST 2014


My wife is trying to run a bootstrap model validation. She is using 
code along these lines:

library(rms)

residual <- sasxport.get("~/R_boot/residual.xpt") 

c1.odat <- somers2(residual$risk.5, residual$caco)["C"]        # c - statistic for original risk estimate in full sample #

fitboot <- function(data) {
                logitfit= lrm ( ... some model ..., data=data)

                absrisk.boot <- data$risk.5 * exp(predict(logitfit,data=data, type='lp') - logitfit$coefficients["Intercept"])

                ni  <- improveProb(x1=data$risk.5, x2=absrisk.boot,y=data$caco) 
                nri.boot <- ni[8]    
                idi.boot <- ni[19]
                # a half-sentence in the documentation tells us that this NRI is calculated independently of risk limits

                c1.boot <- somers2(data$risk.5, data$caco)["C"]        # c - statistic for absolute risk in the i.th sample #
                c2.boot <- somers2(absrisk.boot, data$caco)["C"]        # c - statistic for new risk estimate in the i.th sample #
                deltac.boot <- c2.boot - c1.boot                        # delta-c in the i.th sample

                absrisk.original <- residual$risk.5 * exp(predict(logitfit,data=residual, type='lp') - logitfit$coefficients["Intercept"])

                ni.Odat  = improveProb(x1=residual$risk.5, x2=absrisk.original, y=residual$caco) 
                nri.odat = ni.Odat[8]                                     
                idi.odat = ni.Odat[19]
                # a half-sentence in the documentation tells us that this NRI is calculated independently of risk limits

                c2.odat <- somers2(absrisk.original, residual$caco)["C"]      # c - statistic for new risk score on original sample 
                deltac.odat <- c2.odat - c1.odat                        # delta-c in the full sample

                result.both <- unlist(c(nri.boot = nri.boot, idi.boot = idi.boot, deltac.boot = deltac.boot, nri.odat = nri.odat, idi.odat = idi.odat, deltac.odat = deltac.odat))             
                # combine NRI, IDI and delta-c from the bootstrap sample and from the full sample into the named vector 'result.both'
                result.both
                }

set.seed(1111)

anziter <- 100

boot.res <- matrix(nrow=anziter, ncol=6)


for (i in 1:anziter) {
    boot.id <- sample(nrow(residual), nrow(residual), replace = TRUE)
    bootdat <- residual[boot.id, ]
    boot.res[i, ] <- fitboot(bootdat)
}

When run within Rgui under Windows, Rgui terminates with a
dialog reporting that Rgui is faulty and promising that she will be
notified once this is fixed. When run from Rscript within PowerShell,
Rscript terminates with the same message, with "Rscript" in place of
"Rgui". On a different machine running Ubuntu (Trusty Tahr), R first
complains that the installed version of Hmisc is not recent enough
(which seems to be a problem with the current Ubuntu LTS
distribution). After fixing that, R (run within Emacs) terminates
with a memory address error, offering a core dump.

Adding some trace statements shows that the problems seem
to occur after entering the body of predict() or improveProb().
In fact, when run from ESS, R prints a backtrace showing
that it was in the middle of predict(). Sometimes (even without
altering the seed) R terminates the script with the message "GC
encountered a node (…) with an unknown SEXP type", but this has been
observed only in the Rgui runs.
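
One way to add such tracing is sketched below. It is an illustration
only, not the code actually used: the helper traced() and the log file
name are made up for this example. Each marker is appended to a file
that is opened and closed on every call, so the last "enter" line
should still be on disk if the R process dies inside the wrapped call.

```r
# Hypothetical helper: logs entry and exit around one expression.
# 'expr' is evaluated lazily, i.e. only after the "enter" line is written.
traced <- function(label, expr, logfile = "trace.log") {
  cat("enter", label, "\n", file = logfile, append = TRUE)
  value <- expr   # forces evaluation of the wrapped call
  cat("leave", label, "\n", file = logfile, append = TRUE)
  value
}

# Usage inside fitboot() would wrap the suspect calls, for example
#   ni <- traced("improveProb", improveProb(x1 = data$risk.5,
#                                           x2 = absrisk.boot, y = data$caco))
# An "enter" line without a matching "leave" line marks the failing call.
```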

Is there a way to trap these occurrences and continue the script, 
ignoring all runs that lead to unforeseen behaviour? (So far I have 
suggested calling save() on the whole matrix of intermediate results 
after each run, starting the script as a cron job every 
five minutes, and compiling all the results into one big array once
enough iterations have succeeded.)
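
The save-and-restart idea could look roughly like the sketch below.
It rests on several assumptions: run_checkpointed(), the state file
name and the per-iteration seeding are invented for this example, and
saveRDS() stands in for save(). Each attempt is recorded on disk
before the risky call, because a crash of the R process itself cannot
be intercepted by tryCatch(); a restarted script then skips the
iteration that killed its predecessor and leaves that row as NA.

```r
# Hypothetical checkpointed bootstrap loop (base R only).
# fit_fun takes a resampled data frame and returns n_stat numbers.
run_checkpointed <- function(data, fit_fun, n_iter, n_stat, file, seed = 1111) {
  state <- if (file.exists(file)) {
    readRDS(file)                       # resume from the last checkpoint
  } else {
    list(res = matrix(NA_real_, nrow = n_iter, ncol = n_stat),
         attempted = logical(n_iter))
  }
  for (i in which(!state$attempted)) {
    state$attempted[i] <- TRUE
    saveRDS(state, file)                # record the attempt before the risky call
    set.seed(seed + i)                  # assumption: one seed per iteration, so
                                        # resamples do not depend on restarts
    idx <- sample(nrow(data), nrow(data), replace = TRUE)
    state$res[i, ] <- fit_fun(data[idx, , drop = FALSE])
    saveRDS(state, file)                # keep the result of this iteration
  }
  state$res
}

# Usage with the objects from the script above (file name is made up):
#   boot.res <- run_checkpointed(residual, fitboot, n_iter = 100,
#                                n_stat = 6, file = "boot_state.rds")
# Rows that remain NA belong to iterations that never completed.
```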

-- 
Johannes Hüsing               There is something fascinating about science. 
                              One gets such wholesale returns of conjecture 
mailto:johannes at huesing.name  from such a trifling investment of fact.                
http://derwisch.wikidot.com         (Mark Twain, "Life on the Mississippi")


