[R] R 2.1.0 RH Linux Built from Source Segmentation Fault

Fri May 20 00:20:49 CEST 2005

Background:

I administer a cluster of RedHat EWS 3U4 Linux workstations at a university.

I built R 2.1.0 from source:

./configure \
         --prefix=/sscc/opt/R-2.1.0 \
         --with-blas=no \
         2>&1 \
         | tee NUInstall.configure

R is now configured for i686-pc-linux-gnu

   Source directory:          .
   Installation directory:    /sscc/opt/R-2.1.0

   C compiler:                gcc  -g -O2
   C++ compiler:              g++  -g -O2
   Fortran compiler:          g77  -g -O2

   Interfaces supported:      X11, tcltk
   External libraries:        readline
   Additional capabilities:   PNG, JPEG, iconv, MBCS, NLS
   Options enabled:           R profiling

   Recommended packages:      yes

configure: WARNING: you cannot build info or html versions of the R manuals

The machines are AMD Athlon MP 2400+ with 2 GB RAM, dual CPUs, and 
lots of free disk space.

I've got a user running Monte Carlo codes that fail with segmentation 
faults on a frequent basis. The jobs run for a long time (up to a 
day) before failure.

If a failed job is rerun, chances are high that it will run to completion.

I'm at a loss about approaching this problem. R (as it is here) 
doesn't seem to give much of a hint as to where things are when it 
crashes.

I'm looking for some guidance to diagnose this problem so we can 
focus on a solution.

Thanks!

Here's the annotated output of a failed job. The source file 
bayes_book_r_functions.R comes from Peter Rossi:

http://gsbwww.uchicago.edu/fac/peter.rossi/research/bayes%20book/R%20functions/

The second source file is included below.

R : Copyright 2005, The R Foundation for Statistical Computing
Version 2.1.0  (2005-04-18), ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

   Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.

[Previously saved workspace restored]

>  # program name: mnl.R
>  # my hierarchical bayes logit model using random walk algorithm
>  #
>  # nphy= number of physicians in the sample
>  # nvar = no. of variables in X
>  # nalt = no. of alternatives
>  # nobs = no. of observations
>  # yrows (nphy x 1 matrix) contains no. of observations by each physician
>  # X (nobs*nalt x nvar) contains xvalues for each of j alt on each 
>of n occasions
>  # Y (nobs x 1) contains chosen alternative
>  #
>
>  source("bayes_book_r_functions.R")
>  nalt=5
>
>  X=read.table("x300_new.txt",header=TRUE)
>  X=as.matrix(X)
>  X=cbind(X[,1:4],X[,10:11])
>  nvar=ncol(X)
>  Y=read.table("y300_new.txt",header=TRUE)
>  Y=as.matrix(Y)
>  yrows=read.table("yrows300_new.txt",header=TRUE)
>  nphy=nrow(yrows)
>  nobs=sum(yrows)/nalt
>  if (nrow(X)!=nobs*nalt){print("data dimensions wrong")}
>
>  betastore=NULL
>  delta=diag(nvar)
>  z=read.table("z300.txt",header=TRUE)
>  z=as.matrix(z)
>  k=ncol(z)
>  A1=.1*diag(k)
>  nu1=3+nalt
>  V=diag(nvar)*nu1
>
>  rowx1=rep(0,nphy+1)
>  rowx2=rep(0,nphy)
>  rowy1=rep(0,nphy+1)
>  rowy2=rep(0,nphy)
>  rowx1[1]=1
>  rowy1[1]=1
>  for (i in 1:nphy){
+ # for each physician i draw the relevant X and Y obs using yrows
+ #
+ rowx2[i]=rowx1[i]-1+yrows[i,1]
+ rowy2[i]=rowy1[i]-1+yrows[i,1]/nalt
+ rowx1[i+1]=rowx2[i]+1
+ rowy1[i+1]=rowy2[i]+1
+ }
>
>  R=100000
>  keep=10
>  thetas=matrix(rep(0,(R/keep+1)*nvar*k),ncol=nvar*k)
>  theta=matrix(rep(0,nvar*k),nrow=k)
>  thetabar=theta
>  source("rmnlRwMetrop1.R")
>  scale=.5
>  beta=matrix(rep(0,nphy*nvar),byrow=TRUE,ncol=nvar)
>  accept=rep(0,nphy)
>  accepts=rep(0,R/keep)
>  parameters=list(R=1,keep=1,s=scale)
>  for (j in 1:R)
+ {
+ itime=proc.time()[3]
+ for (i in 1:nphy){
+       # for each physician i draw a beta[i] using MH algorithm
+       #
+       Data=list(m=nalt,X=X[rowx1[i]:rowx2[i],],y=Y[rowy1[i]:rowy2[i],])
+       bbar=t(theta)%*%z[i,]
+       prior=list(A=delta,betabar=bbar)
+       a=rmnlRwMetrop1(Data,prior,Mcmc=parameters,beta[i,])
+       beta[i,]=a$betadraw
+       accept[i]=a$acceptr
+       }
+ mregout=rmultireg(beta,z,thetabar,A1,nu1,V)
+ delta=mregout$Sigma
+ theta=mregout$B
+ if (j%%keep==0){
+ thetas[j/keep,]=as.vector(theta)
+ betastore[j/keep]=list(beta)
+ accepts[j/keep]=mean(accept)
+ ftime=proc.time()[3]
+ cat('Time taken by iteration: ',j, ' = ',round((ftime-itime)/60,2),'\n')
+ }
+ if (j%%10000 == 0)
+ {cat('Mean theta at iteration: ',j,' = 
',apply(thetas[1:j/keep,],2,mean),'\n')
+ cat('Mean sd of theta at iteration: ',j,' = 
',apply(thetas[1:j/keep,],2,sd),'\n')
+ }
+ }
Time taken by iteration:  10  =  0.01
Time taken by iteration:  20  =  0.01
Time taken by iteration:  30  =  0.01
Time taken by iteration:  40  =  0.01
Time taken by iteration:  50  =  0.01

   ... many lines deleted

Time taken by iteration:  92800  =  0.01
Time taken by iteration:  92810  =  0.01
Time taken by iteration:  92820  =  0.01
Time taken by iteration:  92830  =  0.01
Time taken by iteration:  92840  =  0.01

That's it!

The PBS output is:

==============Original message text===============
    From: "seldon.it.northwestern.edu" <root at seldon.it.northwestern.edu>
    Date: Wed, 18 May 2005 4:47:19 pm CDT
    Subject: PBS JOB 1534.seldon

PBS Job Id: 1534.seldon
Job Name:   mnl300_z
Execution terminated
Exit_status=139
resources_used.cpupercent=98
resources_used.cput=00:25:02
resources_used.mem=233536kb
resources_used.ncpus=1
resources_used.vmem=250252kb
resources_used.walltime=00:25:13

===========End of original message text===========

The second source file contains:

rmnlRwMetrop1=function(Data,Prior,Mcmc,beta0)
{
#
# purpose:
#   draw from posterior for MNL using Independence Metropolis
#
# Arguments:
#   Data - list of m,X,y
#     m is number of alternatives
#     X is nobs*m x nvar matrix
#     y is nobs vector of values from 1 to m
#   Prior - list of A, betabar
#     A is nvar x nvar prior preci matrix
#     betabar is nvar x 1 prior mean
#   Mcmc
#     R is number of draws
#     keep is thinning parameter
#     s is scaling parameter for random walk
#   beta0 is initial beta
#
# Output:
#   list of betadraws
#
# Model:   Pr(y=j) = exp(x_j'beta)/sum(exp(x_k'beta)
#
# Prior:   beta ~ N(betabar,A^-1)
#
# check arguments
#
X=Data$X
y=Data$y
m=Data$m
nvar=ncol(X)
nobs=length(y)
# check for Prior
#
if(missing(Prior))
    { betabar=c(rep(0,nvar)); A=diag(rep(.01,nvar))}
else
    {
     if(is.null(Prior$betabar)) {betabar=c(rep(0,nvar))}
        else {betabar=Prior$betabar}
     if(is.null(Prior$A)) {A=matrix(rep(.01,nvar*nvar),ncol=nvar)}
        else {A=Prior$A}
    }
R=Mcmc$R
keep=Mcmc$keep
s=Mcmc$s
# Check beta0 argument
#
if (missing(beta0)) {beta0=c(rep(0,nvar))}
#
betadraw=matrix(double(floor(R/keep)*nvar),ncol=nvar)
#
# compute required quantities for indep candidates
#
oldbeta=beta0
mhess=diag(nvar)
candcov=chol2inv(chol(mhess))
root=s*chol(candcov)
priorcov=chol2inv(chol(A))
rootp=chol(priorcov)
rootpi=backsolve(rootp,diag(nvar))

#
#       start main iteration loop
#
oldlpost=llmnl(y,X,beta0)+lmvn(beta0,betabar,rootpi)
naccept=0

for (rep in 1:R)
{
    betac=oldbeta+t(root)%*%rnorm(nvar)
    clpost=llmnl(y,X,betac)+lmvn(betac,betabar,rootpi)
    ldiff=clpost-oldlpost
    alpha=min(1,exp(ldiff))
    if(alpha < 1) {unif=runif(1)} else {unif=0}
    if (unif <= alpha)
       { beta0=betac
         oldlpost=clpost
         naccept=naccept+1}
    oldbeta=beta0

   if(rep%%keep == 0)
     {mkeep=rep/keep; betadraw[mkeep,]=beta0}
}
list(betadraw=betadraw,acceptr=naccept/R)
}