[R] lmer() causes segfault

Wed Aug 25 17:04:30 CEST 2010

Dennis,

just wow. Thank you so much. I knew it was something trivial - in this  
case the variable type of the of the grouping variables. However,  
something as trivial as this should not throw a segfault IMHO. I tried  
subscribing to R-sig-mixed this morning, but the corresponding mail  
server at the ETH's stats department seems to be down. And thank you  
so much for changing the model, that is a great new starting point.  
Can you recommend a good book that deals with multilevel models in  
lmer() that include longitudinal data? I was not aware of the  
difference between scalar random effects and random slopes and would  
like to read up on that.

Again, thanks a lot.
Regards,
Bertolt

Am 25.08.2010 um 13:47 schrieb Dennis Murphy:

> Hi:
>
> Let's start with the data:
>
> > str(test.data)
> 'data.frame':   100 obs. of  4 variables:
>  $ StudentID: num  17370 17370 17370 17370 17379 ...
>  $ GroupID  : num  1 1 1 1 1 1 1 1 1 1 ...
>  $ Time     : num  1 2 3 4 1 2 3 4 1 2 ...
>  $ Score    : num  76.8 81.8 89.8 92.8 75.9 ...
>
> Both StudentID and GroupID are numeric; in the model, they would be  
> treated as continuous covariates rather than factors, so we need to  
> convert:
>
> test.data$StudentID <- factor(test.data$StudentID)
> test.data$GroupID <- factor(test.data$GroupID)
>
> Secondly, I believe there are some flaws in your model. After  
> converting your variables to factors, I ran
>
> library(lme4)
> mlmoded1.lmer <- lmer(Score ~ Time + (Time | GroupID/StudentID),  
> data = test.data)
>
> You have two groups, so they should be treated as a fixed effect -  
> more specifically, as a fixed blocking factor. The StudentIDs are  
> certainly nested within GroupID, and Time is measured on each  
> StudentID, so it is a repeated measures factor. The output of this  
> model is
> > mlmoded1.lmer
> Linear mixed model fit by REML
> Formula: Score ~ Time + (Time | GroupID/StudentID)
>    Data: test.data
>    AIC   BIC logLik deviance REMLdev
>  393.1 416.5 -187.5    376.9   375.1
> Random effects:
>  Groups            Name        Variance  Std.Dev. Corr
>  StudentID:GroupID (Intercept)  0.504131 0.71002
>                    Time         0.083406 0.28880  1.000
>  GroupID           (Intercept) 12.809567 3.57905
>                    Time         3.897041 1.97409  -1.000
>  Residual                       1.444532 1.20189
> Number of obs: 100, groups: StudentID:GroupID, 25; GroupID, 2
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)   72.803      2.552  28.530
> Time           4.474      1.401   3.193
>
> Correlation of Fixed Effects:
>      (Intr)
> Time -0.994
>
> The high correlations among the random effects and then among the  
> fixed effects suggests that the model specification may be a bit off.
>
> The above model fits random slopes to GroupIDs and StudentIDs, along  
> with random intercepts, but GroupID is a between-subject effect and  
> should be at the top level. Time is a within-subject effect and  
> StudentIDs are the observational units. I modified the model to  
> provide fixed effects for GroupIDs, scalar random effects for  
> StudentIDs and random slopes for StudentIDs.
>
> > mod3 <- lmer(Score ~ 1 + GroupID + Time + (1 | StudentID) +
> +              (0 + Time | StudentID), data = test.data)
> > mod3
> Linear mixed model fit by REML
> Formula: Score ~ 1 + GroupID + Time + (1 | StudentID) + (0 + Time |  
> StudentID)
>    Data: test.data
>    AIC   BIC logLik deviance REMLdev
>  430.9 446.5 -209.4    418.4   418.9
> Random effects:
>  Groups    Name        Variance   Std.Dev.
>  StudentID (Intercept) 4.2186e-13 6.4951e-07
>  StudentID Time        1.8380e+00 1.3557e+00
>  Residual              1.6301e+00 1.2768e+00
> Number of obs: 100, groups: StudentID, 25
>
> Fixed effects:
>             Estimate Std. Error t value
> (Intercept)  70.7705     0.4204  168.33
> GroupID2      4.0248     0.5854    6.88
> Time          4.5292     0.2942   15.39
>
> Correlation of Fixed Effects:
>          (Intr) GrpID2
> GroupID2 -0.668
> Time     -0.264  0.000
>
> I didn't check the quality of the fit, but on the surface it seems  
> to be more stable, FWIW. Perhaps one could also add a term (GroupID  
> | StudentID), but I don't know offhand if that would make any sense.  
> Another issue to consider is whether to fit by REML or ML, but that  
> is secondary to getting the form of the model equation right. I  
> don't claim this as a final model, but rather a 're-starting point'.  
> It may well be in need of improvement, so comments are welcome.
>
> The confusion between subjects nested in time or vice versa has  
> occurred several times this week with respect to repeated measures/ 
> longitudinal models using lmer(), so perhaps it merits a comment:  
> subjects/experimental units are NOT nested in time. Measurements  
> taken on an individual at several time points *entails* that time be  
> nested within subject. Just saying...
>
> This discussion may be better continued on the R-sig-mixed list, so  
> I've cc-ed to that group as well.
>
> HTH,
> Dennis
>
> On Wed, Aug 25, 2010 at 1:27 AM, Bertolt Meyer  
> <bmeyer at sozpsy.uzh.ch> wrote:
> Ben Bolker <bbolker <at> gmail.com> writes:
>
> Bertolt Meyer <bmeyer <at> sozpsy.uzh.ch> writes:
>
>
> Hello lmer() - users,
>
> A call to the lmer() function causes my installation of R (2.11.1 on
> Mac OS X 10.5.8) to crash and I am trying to figure out the problem.
>
>  [snip snip]
>
> detach("package:nlme")
> library(lme4)
>
> mod1 <- lmer(performance ~ time + (time | GroupID/StudentNumber), data
> = dataset.long, na.action = na.omit)
>
> However, this call results in a segfault:
>
> *** caught segfault ***
> address 0x154c3000, cause 'memory not mapped'
>
> and a lengthy traceback. I can reproduce this error. It also occurs
> when I don't load nlme before lme4. Can someone tell me what I am
> doing wrong? Any help is greatly appreciated.
>
>  This may well be a bug in lmer.  There have been a number of
> fussy computational issues with the lme4 package on the Mac platform.
>
> Ben, thanks for your reply. I tried to replicate this issue with a  
> small clean data set on a windows machine. You can find the code for  
> the data frame (100 observations from my data) at the end of this  
> mail. Very simple: four test scores per student over time, and  
> students are nested in groups. On my Windows installation, lmer()  
> throws an error that does not seem to get caught on the Mac,  
> resulting in the segfault:
>
> library(lme4)
> mlmoded1.lmer <- lmer(Score ~ Time + (Time | GroupID/StudentID),  
> data = test.data)
>
> Error: length(f1) == length(f2) is not TRUE
> Addditional Warnings:
> 1: In StudentID:GroupID :
>  numeric expression has 100 elements: only first one is used
> 2: In StudentID:GroupID :
>  numeric expression has 100 elements: only first one is used
>
> It seems to me that I am committing a trivial error here and that I  
> am too blind to see it. Any ideas?
>
> Regards,
> Bertolt
>
> If it is at all possible, please (1) post the results of sessionInfo()
> [which will in particular specify which version of lme4 you are  
> using];
> (2) possibly try this with the latest development version of lme4,  
> from
> R-forge, if that's feasible (it might be necessary to build the  
> package
> from source), and most importantly:
>
> (3) create a reproducible (for others) example -- most easily by
> posting your data on the web somewhere, but if that isn't possible
> by simulating data similar to yours (if it doesn't happen with another
> data set of similar structure, that's a clue -- it says it's some more
> particular characteristic of your data that triggers the problem) and
>
> (4) post to to *either* the R-sig-mac or the R-sig-mixed-models list,
> where the post is more likely to come to the attention of those who
> can help diagnose/fix ...
>
>  good luck
>   Ben Bolker
>
>
>
> test.data <- data.frame(c(17370, 17370, 17370, 17370, 17379, 17379,  
> 17379, 17379, 17387, 17387, 17387, 17387, 17391, 17391, 17391,  
> 17391, 17392, 17392, 17392, 17392, 17394, 17394, 17394, 17394,  
> 17408, 17408, 17408, 17408, 17419, 17419, 17419, 17419, 17429,  
> 17429, 17429, 17429, 17432, 17432, 17432, 17432, 17436, 17436,  
> 17436, 17436, 17439, 17439, 17439, 17439, 17470, 17470, 17470,  
> 17470, 17220, 17220, 17220, 17220, 17348, 17348, 17348, 17348,  
> 17349, 17349, 17349, 17349, 17380, 17380, 17380, 17380, 17398,  
> 17398, 17398, 17398, 17400, 17400, 17400, 17400, 17402, 17402,  
> 17402, 17402, 17403, 17403, 17403, 17403, 17413, 17413, 17413,  
> 17413, 17416, 17416, 17416, 17416, 17420, 17420, 17420, 17420,  
> 17421, 17421, 17421, 17421), c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,  
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,  
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), c(1, 2, 3,  
> 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2,  
> 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1,  
> 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4,  
> 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3,  
> 4, 1, 2, 3, 4), c(76.76, 81.83, 89.78, 92.82, 75.86, 81.84, 88.96,  
> 92.28, 75.28, 80.68, 88.62, 92.29, 76.60, 84.59, 92.03, 94.05,  
> 75.57, 79.94, 86.11, 90.25, 74.54, 81.42, 87.50, 90.71, 76.02,  
> 83.68, 91.11, 94.14, 76.31, 83.76, 90.44, 94.58, 72.29, 80.51,  
> 86.09, 90.41, 74.99, 82.28, 88.77, 92.26, 75.28, 81.92, 89.25,  
> 92.64, 76.31, 83.93, 91.00, 94.60, 76.31, 82.44, 90.57, 95.17,  
> 76.94, 82.21, 83.81, 85.00, 79.96, 81.92, 86.32, 90.05, 82.01,  
> 84.81, 88.79, 93.10, 77.87, 82.94, 86.86, 90.31, 77.87, 79.64,  
> 85.66, 86.97, 79.35, 80.44, 84.26, 83.62, 79.06, 81.56, 85.00,  
> 87.43, 79.34, 81.47, 83.23, 86.86, 79.44, 80.37, 84.36, 89.11,  
> 78.77, 81.02, 81.60, 87.21, 75.75, 79.35, 80.38, 86.87, 76.04,  
> 80.57, 83.36, 86.31))
>
> names(test.data) <- c("StudentID", "GroupID", "Time", "Score")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>