[R] Long jobs completing without output

Brendan Halpin brendan.halpin at ul.ie
Fri Dec 23 14:54:52 CET 2011


I've been fitting a logit model with glmer (called via lmer with
family = binomial) on a very large data set (600k obs).

The model runs correctly on a 10% subset. On the complete data set,
however, R apparently completes without error yet does not display the
results. Since each of these jobs takes about 200 hours, it's very hard
to make progress by trial and error.
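
One way to avoid losing a long fit, sketched here under assumptions, is
to save the fitted object to disk as soon as the fit returns and to
write any warning or error to a log file. The helper name fit_and_save,
the file paths, and the use of glm on the built-in mtcars data as a
quick stand-in for the real model call are all illustrative and not
part of the original script.

```r
# Sketch: run a long model fit so that its result survives even if
# displaying it fails later. All names here are hypothetical.
fit_and_save <- function(fit_fun, rds_path, log_path) {
  # Append a timestamped line to the log file.
  log_line <- function(...) {
    cat(format(Sys.time()), ..., "\n", file = log_path, append = TRUE)
  }
  log_line("fit started")
  result <- tryCatch(
    withCallingHandlers(
      fit_fun(),
      warning = function(w) {
        # Record the warning, then let the fit continue.
        log_line("warning:", conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    ),
    error = function(e) {
      log_line("error:", conditionMessage(e))
      NULL
    }
  )
  if (!is.null(result)) {
    saveRDS(result, rds_path)  # write the fitted object before displaying it
    log_line("fit saved to", rds_path)
  }
  invisible(result)
}

# Stand-in for the real call: a small logit on built-in data.
rds_file <- tempfile(fileext = ".rds")
log_file <- tempfile(fileext = ".log")
fit <- fit_and_save(
  function() glm(am ~ wt, data = mtcars, family = binomial(link = "logit")),
  rds_file, log_file
)

# Later, possibly in a fresh session, reload and inspect the saved fit.
reloaded <- readRDS(rds_file)
print(coef(reloaded))
```

This only catches R-level errors and warnings. If the process is
terminated from outside, the log would show "fit started" with no
later entry, which is still more informative than an empty transcript.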

I append the code, followed by the output from the 10% sample and from
the complete data set. As the startup banners show, I upgraded R (from
2.13.1 to 2.14.0) while the complete run was in progress, but I recall
testing on the subsample with the earlier version too. I am also
assuming that upgrading R does not affect a process that is already
running -- is this true?

I'd be grateful for any leads. In the meantime I'll be running with
larger subsamples!
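
A minimal sketch of how the larger subsamples could be drawn and timed,
under assumptions: whole students are sampled rather than individual
rows, so each student's records stay together, and the samples are
nested so that each larger sample contains the smaller ones. The data
frame, the column names id, x and y, and the glm stand-in for the real
model are synthetic and hypothetical.

```r
# Sketch: draw nested subsamples of whole students and time a fit on each.
set.seed(1)

# Synthetic stand-in data: 200 students (id) with 10 records each.
n_id <- 200
dat <- data.frame(id = rep(seq_len(n_id), each = 10))
dat$x <- rnorm(nrow(dat))
dat$y <- rbinom(nrow(dat), 1, plogis(-0.5 + 0.8 * dat$x))

# One random ordering of ids; each fraction takes a prefix of it, so the
# 10% sample is contained in the 25% sample, and so on.
id_order <- sample(unique(dat$id))
fractions <- c(0.10, 0.25, 0.50, 1.00)

timings <- sapply(fractions, function(f) {
  keep <- id_order[seq_len(ceiling(f * length(id_order)))]
  sub <- dat[dat$id %in% keep, ]
  # Time the stand-in fit on this subsample.
  elapsed <- system.time(
    glm(y ~ x, data = sub, family = binomial(link = "logit"))
  )[["elapsed"]]
  c(fraction = f, n_obs = nrow(sub), seconds = elapsed)
})

# One row per fraction: sample size and elapsed time.
print(t(timings))
```

Plotting seconds against n_obs for the real model would give a rough
idea of how run time grows before committing to the full data set.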

Regards,

Brendan Halpin


- code ---------------------------------------------------------------
library(arm)
library(foreign)
mlm <- read.dta("../workingdata.dta")
attach(mlm)

gender <- as.factor(stu_gend)

yr <- year - 1998
failure <- (lmer(fail ~
              1 + cao + subj1 + subj2 + subj3 + gender + yr + ageentry + as.factor(yrs5)
                + modsize  + meancao + depfemr + (1|deptno) + (1|modinst)  + (1|ulid) , 
              na.action = na.exclude, family = binomial (link="logit")))

display(failure, digits=5, detail=TRUE)
----------------------------------------------------------------------

- output with 10% sample data ----------------------------------------
R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i486-pc-linux-gnu (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arm)

arm (Version 1.4-13, built: 2011-6-19)
Working directory is /home/brendan/work/mlmmarks/genderECSR 
> library(foreign)
> mlm <- read.dta("../worksample-random1.dta")
> attach(mlm)
> 
> gender <- as.factor(stu_gend)
> 
> yr <- year - 1998
> failure <- (lmer(fail ~
+               1 + cao + subj1 + subj2 + subj3 + gender + yr + ageentry + as.factor(yrs5)
+                 + modsize  + meancao + depfemr + (1|deptno) + (1|modinst)  + (1|ulid) , na.action = na.exclude, family = binomial (link="logit")))
> 
> display(failure, digits=5, detail=TRUE)
glmer(formula = fail ~ 1 + cao + subj1 + subj2 + subj3 + gender + 
    yr + ageentry + as.factor(yrs5) + modsize + meancao + depfemr + 
    (1 | deptno) + (1 | modinst) + (1 | ulid), family = binomial(link = "logit"), 
    na.action = na.exclude)
                 coef.est  coef.se   z value   Pr(>|z|) 
(Intercept)        2.63826   0.97870   2.69568   0.00702
cao               -2.08963   0.11987 -17.43314   0.00000
subj1              0.02608   0.23573   0.11064   0.91190
subj2             -0.55668   0.32759  -1.69932   0.08926
subj3             -1.57120   0.30664  -5.12400   0.00000
genderM            0.36368   0.09188   3.95845   0.00008
yr                 0.06067   0.01658   3.65996   0.00025
ageentry          -0.00720   0.04338  -0.16598   0.86817
as.factor(yrs5)1  -0.25181   0.05712  -4.40806   0.00001
as.factor(yrs5)2  -0.54725   0.07601  -7.20005   0.00000
as.factor(yrs5)3  -1.07483   0.08660 -12.41184   0.00000
as.factor(yrs5)4  -1.22447   0.14373  -8.51932   0.00000
as.factor(yrs5)5  -1.55032   0.31342  -4.94653   0.00000
modsize            0.03387   0.02533   1.33733   0.18112
meancao            1.08747   0.10748  10.11780   0.00000
depfemr           -1.49097   0.49350  -3.02122   0.00252

Error terms:
 Groups   Name        Std.Dev.
 modinst  (Intercept) 1.14308 
 ulid     (Intercept) 1.54030 
 deptno   (Intercept) 0.52497 
 Residual             1.00000 
---
number of obs: 63254, groups: modinst, 9076; ulid, 2275; deptno, 26
AIC = 30275.2, DIC = 30237.2
deviance = 30237.2 
> 
Loading required package: MASS
Loading required package: Matrix
Loading required package: lattice

Attaching package: ‘Matrix’

The following object(s) are masked from ‘package:base’:

    det

Loading required package: lme4

Attaching package: ‘lme4’

The following object(s) are masked from ‘package:stats’:

    AIC, BIC

Loading required package: R2WinBUGS
Loading required package: coda

Attaching package: ‘coda’

The following object(s) are masked from ‘package:lme4’:

    HPDinterval

Loading required package: abind
Loading required package: foreign

Attaching package: ‘arm’

The following object(s) are masked from ‘package:coda’:

    traceplot
----------------------------------------------------------------------

- output with complete data ------------------------------------------
R version 2.13.1 (2011-07-08)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i486-pc-linux-gnu (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(arm)

arm (Version 1.4-13, built: 2011-6-19)
Working directory is /home/brendan/work/mlmmarks/genderECSR 
> library(foreign)
> mlm <- read.dta("../workingdata.dta")
> attach(mlm)
> 
> gender <- as.factor(stu_gend)
> 
> yr <- year - 1998
> failure <- (lmer(fail ~
+               1 + cao + subj1 + subj2 + subj3 + gender + yr + ageentry + as.factor(yrs5)
+                 + modsize  + meancao + depfemr + (1|deptno) + (1|modinst)  + (1|ulid) , na.action = na.exclude, family = binomial (link="logit")))
Loading required package: MASS
Loading required package: Matrix
Loading required package: lattice

Attaching package: ‘Matrix’

The following object(s) are masked from ‘package:base’:

    det

Loading required package: lme4

Attaching package: ‘lme4’

The following object(s) are masked from ‘package:stats’:

    AIC, BIC

Loading required package: R2WinBUGS
Loading required package: coda

Attaching package: ‘coda’

The following object(s) are masked from ‘package:lme4’:

    HPDinterval

Loading required package: abind
Loading required package: foreign

Attaching package: ‘arm’

The following object(s) are masked from ‘package:coda’:

    traceplot
----------------------------------------------------------------------
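
The complete-data transcript above stops after the lmer call is echoed;
the display() line never appears. To check whether a batch run really
exits normally, one option is to launch the script from a small driver
that records the exit status and keeps standard output and standard
error in separate files. This is only a sketch: the two-line script
below is a hypothetical stand-in for the real job, and the file names
are temporary files.

```r
# Sketch: run a script in a child R process and record its exit status.
script <- tempfile(fileext = ".R")  # hypothetical stand-in for the real job
writeLines(c("x <- 1 + 1", "print(x)"), script)

out_file <- tempfile(fileext = ".out")  # standard output of the child
err_file <- tempfile(fileext = ".err")  # messages, warnings and errors
rscript <- file.path(R.home("bin"), "Rscript")

# system2() returns the exit status of the child process when it waits for it.
status <- system2(rscript, args = c("--vanilla", shQuote(script)),
                  stdout = out_file, stderr = err_file)

cat("exit status:", status, "\n")  # 0 means the child R process exited normally
cat(readLines(out_file), sep = "\n")
```

A non-zero status, or an output file that ends before the last command
in the script, would suggest that the job stopped early rather than
finishing without printing.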

-- 
Brendan Halpin,   Department of Sociology,   University of Limerick,   Ireland
Tel: w +353-61-213147  f +353-61-202569  h +353-61-338562;  Room F1-009 x 3147
mailto:brendan.halpin at ul.ie    ULSociology on Facebook: http://on.fb.me/fjIK9t
http://teaching.sociology.ul.ie/bhalpin/wordpress         twitter:@ULSociology


