[R] lmer() causes segfault
Bertolt Meyer
bmeyer at sozpsy.uzh.ch
Wed Aug 25 17:04:30 CEST 2010
Dennis,
just wow. Thank you so much. I knew it was something trivial - in this
case the variable type of the of the grouping variables. However,
something as trivial as this should not throw a segfault IMHO. I tried
subscribing to R-sig-mixed this morning, but the corresponding mail
server at the ETH's stats department seems to be down. And thank you
so much for changing the model, that is a great new starting point.
Can you recommend a good book that deals with multilevel models in
lmer() that include longitudinal data? I was not aware of the
difference between scalar random effects and random slopes and would
like to read up on that.
Again, thanks a lot.
Regards,
Bertolt
Am 25.08.2010 um 13:47 schrieb Dennis Murphy:
> Hi:
>
> Let's start with the data:
>
> > str(test.data)
> 'data.frame': 100 obs. of 4 variables:
> $ StudentID: num 17370 17370 17370 17370 17379 ...
> $ GroupID : num 1 1 1 1 1 1 1 1 1 1 ...
> $ Time : num 1 2 3 4 1 2 3 4 1 2 ...
> $ Score : num 76.8 81.8 89.8 92.8 75.9 ...
>
> Both StudentID and GroupID are numeric; in the model, they would be
> treated as continuous covariates rather than factors, so we need to
> convert:
>
> test.data$StudentID <- factor(test.data$StudentID)
> test.data$GroupID <- factor(test.data$GroupID)
>
> Secondly, I believe there are some flaws in your model. After
> converting your variables to factors, I ran
>
> library(lme4)
> mlmoded1.lmer <- lmer(Score ~ Time + (Time | GroupID/StudentID),
> data = test.data)
>
> You have two groups, so they should be treated as a fixed effect -
> more specifically, as a fixed blocking factor. The StudentIDs are
> certainly nested within GroupID, and Time is measured on each
> StudentID, so it is a repeated measures factor. The output of this
> model is
> > mlmoded1.lmer
> Linear mixed model fit by REML
> Formula: Score ~ Time + (Time | GroupID/StudentID)
> Data: test.data
> AIC BIC logLik deviance REMLdev
> 393.1 416.5 -187.5 376.9 375.1
> Random effects:
> Groups Name Variance Std.Dev. Corr
> StudentID:GroupID (Intercept) 0.504131 0.71002
> Time 0.083406 0.28880 1.000
> GroupID (Intercept) 12.809567 3.57905
> Time 3.897041 1.97409 -1.000
> Residual 1.444532 1.20189
> Number of obs: 100, groups: StudentID:GroupID, 25; GroupID, 2
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 72.803 2.552 28.530
> Time 4.474 1.401 3.193
>
> Correlation of Fixed Effects:
> (Intr)
> Time -0.994
>
> The high correlations among the random effects and then among the
> fixed effects suggests that the model specification may be a bit off.
>
> The above model fits random slopes to GroupIDs and StudentIDs, along
> with random intercepts, but GroupID is a between-subject effect and
> should be at the top level. Time is a within-subject effect and
> StudentIDs are the observational units. I modified the model to
> provide fixed effects for GroupIDs, scalar random effects for
> StudentIDs and random slopes for StudentIDs.
>
> > mod3 <- lmer(Score ~ 1 + GroupID + Time + (1 | StudentID) +
> + (0 + Time | StudentID), data = test.data)
> > mod3
> Linear mixed model fit by REML
> Formula: Score ~ 1 + GroupID + Time + (1 | StudentID) + (0 + Time |
> StudentID)
> Data: test.data
> AIC BIC logLik deviance REMLdev
> 430.9 446.5 -209.4 418.4 418.9
> Random effects:
> Groups Name Variance Std.Dev.
> StudentID (Intercept) 4.2186e-13 6.4951e-07
> StudentID Time 1.8380e+00 1.3557e+00
> Residual 1.6301e+00 1.2768e+00
> Number of obs: 100, groups: StudentID, 25
>
> Fixed effects:
> Estimate Std. Error t value
> (Intercept) 70.7705 0.4204 168.33
> GroupID2 4.0248 0.5854 6.88
> Time 4.5292 0.2942 15.39
>
> Correlation of Fixed Effects:
> (Intr) GrpID2
> GroupID2 -0.668
> Time -0.264 0.000
>
> I didn't check the quality of the fit, but on the surface it seems
> to be more stable, FWIW. Perhaps one could also add a term (GroupID
> | StudentID), but I don't know offhand if that would make any sense.
> Another issue to consider is whether to fit by REML or ML, but that
> is secondary to getting the form of the model equation right. I
> don't claim this as a final model, but rather a 're-starting point'.
> It may well be in need of improvement, so comments are welcome.
>
> The confusion between subjects nested in time or vice versa has
> occurred several times this week with respect to repeated measures/
> longitudinal models using lmer(), so perhaps it merits a comment:
> subjects/experimental units are NOT nested in time. Measurements
> taken on an individual at several time points *entails* that time be
> nested within subject. Just saying...
>
> This discussion may be better continued on the R-sig-mixed list, so
> I've cc-ed to that group as well.
>
> HTH,
> Dennis
>
> On Wed, Aug 25, 2010 at 1:27 AM, Bertolt Meyer
> <bmeyer at sozpsy.uzh.ch> wrote:
> Ben Bolker <bbolker <at> gmail.com> writes:
>
> Bertolt Meyer <bmeyer <at> sozpsy.uzh.ch> writes:
>
>
> Hello lmer() - users,
>
> A call to the lmer() function causes my installation of R (2.11.1 on
> Mac OS X 10.5.8) to crash and I am trying to figure out the problem.
>
> [snip snip]
>
> detach("package:nlme")
> library(lme4)
>
> mod1 <- lmer(performance ~ time + (time | GroupID/StudentNumber), data
> = dataset.long, na.action = na.omit)
>
> However, this call results in a segfault:
>
> *** caught segfault ***
> address 0x154c3000, cause 'memory not mapped'
>
> and a lengthy traceback. I can reproduce this error. It also occurs
> when I don't load nlme before lme4. Can someone tell me what I am
> doing wrong? Any help is greatly appreciated.
>
> This may well be a bug in lmer. There have been a number of
> fussy computational issues with the lme4 package on the Mac platform.
>
> Ben, thanks for your reply. I tried to replicate this issue with a
> small clean data set on a windows machine. You can find the code for
> the data frame (100 observations from my data) at the end of this
> mail. Very simple: four test scores per student over time, and
> students are nested in groups. On my Windows installation, lmer()
> throws an error that does not seem to get caught on the Mac,
> resulting in the segfault:
>
> library(lme4)
> mlmoded1.lmer <- lmer(Score ~ Time + (Time | GroupID/StudentID),
> data = test.data)
>
> Error: length(f1) == length(f2) is not TRUE
> Addditional Warnings:
> 1: In StudentID:GroupID :
> numeric expression has 100 elements: only first one is used
> 2: In StudentID:GroupID :
> numeric expression has 100 elements: only first one is used
>
> It seems to me that I am committing a trivial error here and that I
> am too blind to see it. Any ideas?
>
> Regards,
> Bertolt
>
> If it is at all possible, please (1) post the results of sessionInfo()
> [which will in particular specify which version of lme4 you are
> using];
> (2) possibly try this with the latest development version of lme4,
> from
> R-forge, if that's feasible (it might be necessary to build the
> package
> from source), and most importantly:
>
> (3) create a reproducible (for others) example -- most easily by
> posting your data on the web somewhere, but if that isn't possible
> by simulating data similar to yours (if it doesn't happen with another
> data set of similar structure, that's a clue -- it says it's some more
> particular characteristic of your data that triggers the problem) and
>
> (4) post to to *either* the R-sig-mac or the R-sig-mixed-models list,
> where the post is more likely to come to the attention of those who
> can help diagnose/fix ...
>
> good luck
> Ben Bolker
>
>
>
> test.data <- data.frame(c(17370, 17370, 17370, 17370, 17379, 17379,
> 17379, 17379, 17387, 17387, 17387, 17387, 17391, 17391, 17391,
> 17391, 17392, 17392, 17392, 17392, 17394, 17394, 17394, 17394,
> 17408, 17408, 17408, 17408, 17419, 17419, 17419, 17419, 17429,
> 17429, 17429, 17429, 17432, 17432, 17432, 17432, 17436, 17436,
> 17436, 17436, 17439, 17439, 17439, 17439, 17470, 17470, 17470,
> 17470, 17220, 17220, 17220, 17220, 17348, 17348, 17348, 17348,
> 17349, 17349, 17349, 17349, 17380, 17380, 17380, 17380, 17398,
> 17398, 17398, 17398, 17400, 17400, 17400, 17400, 17402, 17402,
> 17402, 17402, 17403, 17403, 17403, 17403, 17413, 17413, 17413,
> 17413, 17416, 17416, 17416, 17416, 17420, 17420, 17420, 17420,
> 17421, 17421, 17421, 17421), c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), c(1, 2, 3,
> 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2,
> 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1,
> 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4,
> 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3,
> 4, 1, 2, 3, 4), c(76.76, 81.83, 89.78, 92.82, 75.86, 81.84, 88.96,
> 92.28, 75.28, 80.68, 88.62, 92.29, 76.60, 84.59, 92.03, 94.05,
> 75.57, 79.94, 86.11, 90.25, 74.54, 81.42, 87.50, 90.71, 76.02,
> 83.68, 91.11, 94.14, 76.31, 83.76, 90.44, 94.58, 72.29, 80.51,
> 86.09, 90.41, 74.99, 82.28, 88.77, 92.26, 75.28, 81.92, 89.25,
> 92.64, 76.31, 83.93, 91.00, 94.60, 76.31, 82.44, 90.57, 95.17,
> 76.94, 82.21, 83.81, 85.00, 79.96, 81.92, 86.32, 90.05, 82.01,
> 84.81, 88.79, 93.10, 77.87, 82.94, 86.86, 90.31, 77.87, 79.64,
> 85.66, 86.97, 79.35, 80.44, 84.26, 83.62, 79.06, 81.56, 85.00,
> 87.43, 79.34, 81.47, 83.23, 86.86, 79.44, 80.37, 84.36, 89.11,
> 78.77, 81.02, 81.60, 87.21, 75.75, 79.35, 80.38, 86.87, 76.04,
> 80.57, 83.36, 86.31))
>
> names(test.data) <- c("StudentID", "GroupID", "Time", "Score")
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
More information about the R-help
mailing list