[R] R coredumps when calling functions from rms package
Johannes Huesing
johannes at huesing.name
Mon Oct 6 08:42:44 CEST 2014
My wife is trying to run some bootstrap model validation. She is using
code to the effect of
library(rms)
residual <- sasxport.get("~/R_boot/residual.xpt")
c1.odat <- somers2(residual$risk.5, residual$caco)["C"] # c - statistic for original risk estimate in full sample #
fitboot <- function(data) {
logitfit= lrm ( ... some model ..., data=data)
absrisk.boot <- data$risk.5 * exp(predict(logitfit,data=data, type='lp') - logitfit$coefficients["Intercept"])
ni <- improveProb(x1=data$risk.5, x2=absrisk.boot,y=data$caco)
nri.boot <- ni[8]
idi.boot <- ni[19]
# a half-sentence in the documentation tells us that this NRI is calculated independently of risk limits
c1.boot <- somers2(data$risk.5, data$caco)["C"] # c - statistic for absolute risk in the i.th sample #
c2.boot <- somers2(absrisk.boot, data$caco)["C"] # c - statistic for new risk estimate in the i.th sample #
deltac.boot <- c2.boot - c1.boot # delta-c
absrisk.original <- residual$risk.5 * exp(predict(logitfit,data=residual, type='lp') - logitfit$coefficients["Intercept"])
ni.Odat = improveProb(x1=residual$risk.5, x2=absrisk.original, y=residual$caco)
nri.odat = ni.Odat[8]
idi.odat = ni.Odat[19]
# a half-sentence in the documentation tells us that this NRI is calculated independently of risk limits
c2.odat <- somers2(absrisk.original, residual$caco)["C"] # c - statistic for new risk score on original sample
deltac.odat <- c2.odat - c1.odat # delta c full sample
result.both <- unlist(c(nri.boot = nri.boot, idi.boot = idi.boot, deltac.boot = deltac.boot, nri.odat = nri.odat, idi.odat = idi.odat, deltac.odat = deltac.odat))
# combining results of NRI, IDI and C-statistics from full sample, putting the values to the data frame called 'result.both'
result.both
}
set.seed(1111)
anziter <- 100
boot.res <- matrix(nrow=anziter, ncol=6)
for (i in 1:anziter) {
boot.id <- sample(nrow(residual), nrow(residual), replace = TRUE)
bootdat <- residual[boot.id, ]
boot.res[i, ] <- fitboot(bootdat)
}
When run within Rgui under Windows, Rgui will terminate showing a
dialog promising that Rgui is faulty and she will be notified once
this is fixed. When run from Rscript within the Powershell, Rscript
will terminate with a message where "Rgui" is replaced by
"Rscript". When run on a different machine within Ubuntu (Trusty
Tahr), R will complain that Hmisc is in a version not recent enough
(which seems to be a problem with the current Ubuntu LTS
distribution). After fixing it, R (run within Emacs) will terminate
with a memory address error, offering a core dump.
Tossing in some trace instructions will reveal that the problems seem
to occur after entering the predict() or improveProb() function
body. In fact, when run from ESS it will print a backtrace revealing
that it was in the middle of predict(). Sometimes (even without
altering the seed) R will terminate the script with the message "GC
encountered a node (…) with an unknown SEXP type", but this has been
observed only in the Rgui calls.
Is there a way to trap these occurrences and continue the script,
ignoring all runs that lead to unforeseen behaviour? (So far I have
suggested to save() the whole matrix of intermediate results to
a file after each run, and start the script as a cron job every
five minutes, and compile all the results to one big array after
sufficiently enough successful iterations.)
--
Johannes Hüsing There is something fascinating about science.
One gets such wholesale returns of conjecture
mailto:johannes at huesing.name from such a trifling investment of fact.
http://derwisch.wikidot.com (Mark Twain, "Life on the Mississippi")
More information about the R-help
mailing list