[R-sig-ME] Timings with SAMM, lme4, nlme
Kevin Wright
kw.statr at gmail.com
Fri Feb 2 17:33:12 CET 2007
(1) In short, I can find essentially no difference in the results of the
samm and lmer models (the most complex one that both functions could
handle):
lmer: math~ gr + sx + eth + cltype + (1+yrs|id) + (1|sch)
samm: math ~ gr + sx + eth + cltype, random=~ us(link(~1+yrs)):id + sch
After several hours searching through the return values of the two functions
(slots, S3 methods, S4 methods, extractors and all that) and identifying
unique approaches (lmer uses polynomial contrasts for ordered factors and
samm uses treatment contrasts; missing values in the data are handled
slightly differently in the way the random effects return values are
presented), I find the two functions have nearly identical variance
components and essentially identical fixed effects and random effects. An R
transcript is attached for reference.
(2) Yes, SAMM is proprietary. It is available on Windows and Linux, S-Plus
and R. The developer has told me that version 2 is very near completion.
If you ever want to try it out, there is a 30-day free trial before it stops
working.
I use lme4 because it is open-source and has a good community of users,
published examples, etc. I use samm to analyze data from plant breeding
experiments (current literature methods use large data sets, crossed random
effects, heteroskedasticity, spatial correlation, etc.). SAMM also has
convenient reporting of linear predictions of BLUEs/BLUPs (Welham , Cullis,
Gogel, Gilmour, & Thompson 2004, Stroup & Mulitze 1991) for
decision-making. I also use both because fitting the same model using two
different software packages (when the software capabilities allow for it)
really helps me think carefully and hard about what I'm asking the software
to do, what it actually does, and what the results actually mean.
Thanks for the challenging question...I learned more because of it.
Kevin
(thread edited below for brevity)
Doug Bates wrote:
I'm not sure if I have access to SAMM to check the timings. Is SAMM
> proprietary? My recollection is that it is proprietary and that it is
> Windows-only. Either one of those characteristics is the kiss of
> death for my wanting to use it.
>
> Are you sure that SAMM is fitting a model with partially-crossed
> random effects? If you are only considering students and schools as
> grouping factors for the random effects there will be little
> difference between assuming nested random effects and correcting for
> the partial crossing. This is because there are very few students in
> this study who attend more than one school. If you considered
> teachers as well as students and schools as grouping factors then you
> could more easily discern the difference in model fits taking into
> account the crossing or assuming that each student:teacher:school
> combination is unique.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20070202/36df8a8d/attachment.html>
-------------- next part --------------
R> library(mlmRev)
Loading required package: lme4
Loading required package: Matrix
R> library(samm)
Attaching package: 'samm'
R> library(kw) # For ranef.samm, fixef.samm, VarCorr.samm
# samm and lmer treat missings slightly differently in the return value,
# so level the playing field by removing all observations with missing values
R> star2 <- subset(star, !(is.na(math) | is.na(gr) | is.na(sx) | is.na(eth) |
+ is.na(cltype) | is.na(yrs) | is.na(id) | is.na(sch)))
# In the 'star' data, gr is an "ordered factor"
# lmer uses polynomial contrasts for ordered factors by default
# samm uses treatment contrasts for ordered factors
# Make 'gr' into an unordered factor
R> star2$gr <- as.factor(as.character(star2$gr))
# Fit the models. lmer: 10.4 seconds lmer2: 11.3 samm: 6.0
R> system.time(f.lmer <- lmer(math~ gr + sx + eth + cltype +
+ (1+yrs|id) + (1|sch), data=star2,
+ control=list(gradient = FALSE, niterEM = 0)))
[1] 16.13 0.40 18.22 NA NA
R> system.time(f.lmer2 <- lmer2(math~ gr + sx + eth + cltype +
+ (1+yrs|id) + (1|sch), data=star2))
[1] 17.00 0.53 18.50 NA NA
R> system.time(f.samm <- samm(math ~ gr + sx + eth + cltype,
+ random=~ us(link(~1+yrs)):id + sch ,
+ data=star2))
+-----------------------------------------------------------------+
Spatial Analysis <> Mixed Models Version: VSN 1.10
AIsamm license expires: 65535 days
SAMM Convergence Monitoring: Thu Feb 01 17:08:08 2007
+-----------------------------------------------------------------+
Seq Component
1 link("~1 + yrs"):id!Intercept:Intercept
2 link("~1 + yrs"):id!yrs:Intercept
3 link("~1 + yrs"):id!yrs:yrs
4 sch
5 R!variance
Equations: 21560 (17 dense)
Initial update shrinkage factor: 0.316
Singularities: 4 (more)
LogLik S2 DF 1 2 3 4 5
-101615.5560 898.1042 24566 0.150 0.100 0.150 0.100 1.000 17:08:09
-100808.5510 796.0875 24566 0.435 0.071 0.118 0.125 1.000 17:08:10
-100144.8784 689.3688 24566 0.966 -0.005 0.101 0.167 1.000 17:08:10
-99855.8696 603.4995 24566 1.786 -0.155 0.113 0.214 1.000 17:08:11
-99838.3681 584.1043 24566 2.086 -0.228 0.128 0.212 1.000 17:08:11
-99838.0609 583.0869 24566 2.119 -0.240 0.130 0.209 1.000 17:08:12
-99838.0569 583.1879 24566 2.121 -0.241 0.130 0.209 1.000 17:08:12
FINAL parameter values: 2.121 -0.241 0.130 0.209 1.000
Constraint codes: U U U P P
Exit status: 0 - LogLikelihood Converged
Finished on: Thu Feb 01 17:08:12 2007
Convergence monitoring: LogLikelihood Converged
[1] 5.70 0.02 6.38 NA NA
# Use results of lmer, since extractor functions seem easier to use
# Variance components for samm
R> kw::VarCorr(f.samm)
Response: math
Component Std Err Z Ratio Constraint
link("~1 + yrs"):id!Intercept:Intercept 1236.8585 30.3504 40.753 Unconstrained
link("~1 + yrs"):id!yrs:Intercept -140.5416 10.4943 -13.392 Unconstrained
link("~1 + yrs"):id!yrs:yrs 75.8448 5.0346 15.065 Unconstrained
sch 121.6656 21.1143 5.762 Positive
R!variance 583.1879 8.6846 67.152 Positive
# Variance components for lmer
R> VarCorr(f.lmer)$id
2 x 2 Matrix of class "dpoMatrix"
(Intercept) yrs
(Intercept) 1236.9322 -140.56470
yrs -140.5647 75.85085
R> VarCorr(f.lmer)$sch
1 x 1 Matrix of class "dpoMatrix"
(Intercept)
(Intercept) 121.6565
R> attr(VarCorr(f.lmer),"sc")^2
scale
583.2067
# Fixed effects for samm
R> kw::fixef(f.samm)
cltype_small cltype_reg cltype_reg+A eth_W eth_B eth_A
0.000000 -7.340255 -6.054712 0.000000 -22.934383 2.554495
eth_H eth_I eth_O sx_M sx_F gr_1
1.410943 -34.301117 3.402846 0.000000 2.797615 0.000000
gr_2 gr_3 gr_K (Intercept)
47.993067 83.587681 -44.421802 539.051085
# Fixed effects for lmer
R> fixef(f.lmer)
(Intercept) gr2 gr3 grK sxF ethB
539.051209 47.993546 83.588789 -44.421994 2.797348 -22.934191
ethA ethH ethI ethO cltypereg cltypereg+A
2.556208 1.411219 -34.300729 3.405428 -7.340299 -6.054774
# Correlation of fixed effects for samm/lmer
R> fs <- kw::fixef(f.samm)
R> fs <- fs[abs(fs)>.001]
R> cor(sort(fs), sort(fixef(f.lmer)))
[1] 1
R> # samm random intercept for each school
R> as.vector(tail(kw::ranef(f.samm),80))
[1] 1.3143717 -20.5348662 3.2968950 -1.0183895 -21.1174158 -17.1111596
[7] 8.1185468 -3.5960209 1.9353613 21.5313533 19.7123450 13.1381558
[13] 5.1417748 6.4959617 -4.9180225 -9.1269814 -8.5555231 -8.0622654
[19] -12.9893362 7.3534878 -8.7966736 0.1812109 20.5266225 11.7676528
[25] -8.8642947 -19.4369518 11.1248498 -4.7711922 17.2450765 8.9582622
[31] -2.9094668 -13.8162709 -9.0721915 -7.1982139 -10.0674440 4.3567975
[37] 2.9597060 -7.9531537 3.9690027 -2.6906443 25.1882796 -6.8958274
[43] 3.0637880 -9.1939666 5.5059775 -1.9412242 -7.6311463 -11.8170561
[49] -7.0021395 -5.6523660 4.6034859 6.9618590 -9.6490649 2.2870641
[55] -12.6429433 -16.4871068 -10.5297346 -0.6011866 -0.3508403 -13.0738001
[61] 9.2144515 -3.0903109 9.7637953 0.4114071 9.0939491 1.9779132
[67] -3.2119299 12.8595955 17.8323498 0.8572007 -8.1456880 15.8826757
[73] 5.1027719 23.8611109 1.9011668 -6.3425222 5.6704674 -2.2273366
[79] 9.7215416 -1.7956171
# lmer random intercept for each school
R> lme4:::ranef(f.lmer)[[2]][,1]
[1] 1.3133874 -20.5335295 3.2956710 -1.0160289 -21.1169777 -17.1095972
[7] 8.1168346 -3.5945118 1.9352882 21.5310053 19.7115966 13.1401330
[13] 5.1383739 6.4961299 -4.9184228 -9.1255942 -8.5557981 -8.0617211
[19] -12.9903605 7.3527759 -8.7991525 0.1799393 20.5243880 11.7679624
[25] -8.8640390 -19.4362049 11.1219601 -4.7703302 17.2445306 8.9532094
[31] -2.9119062 -13.8155380 -9.0719066 -7.1963657 -10.0661642 4.3582431
[37] 2.9599896 -7.9522175 3.9701887 -2.6930150 25.1871108 -6.8951564
[43] 3.0654065 -9.1979970 5.5086618 -1.9394761 -7.6314905 -11.8180674
[49] -7.0025942 -5.6537153 4.6030342 6.9599747 -9.6486151 2.2865630
[55] -12.6427040 -16.4839846 -10.5288129 -0.6033224 -0.3515562 -13.0727761
[61] 9.2167562 -3.0882666 9.7630371 0.4125758 9.0944709 1.9818962
[67] -3.2120202 12.8576685 17.8320933 0.8574842 -8.1447511 15.8819324
[73] 5.1012380 23.8613070 1.9013746 -6.3435578 5.6695532 -2.2235653
[79] 9.7241380 -1.7960726
# Correlation of samm/lmer random intercepts for school
R> cor(lme4:::ranef(f.lmer)[[2]][,1], as.vector(tail(kw::ranef(f.samm),80)))
[1] 1
# lmer Random slope/intercept for each 'id'
R> head(lme4:::ranef(f.lmer)[[1]])
(Intercept) yrs
100017 89.893617 -10.21549112
100028 -36.049228 4.09662647
100045 6.318275 -0.00364208
100064 -9.404819 1.06876159
100070 62.249604 -6.72732399
100096 26.410811 0.76421193
# samm
R> data.frame(int=kw::ranef(f.samm)[1:6],
+ yrs=kw::ranef(f.samm)[10733:10738])
int yrs
id_100017 89.890923 -10.209949210
id_100028 -36.049490 4.094556472
id_100045 6.317053 -0.003151222
id_100064 -9.403919 1.068111606
id_100070 62.244643 -6.723240707
id_100096 26.408558 0.766162668
# The tail of the random effects from lme
R> tail(lme4:::ranef(f.lmer)[[1]])
(Intercept) yrs
99912 5.695326 -3.6083994
99937 13.972907 4.4756514
99942 13.718499 -1.5589673
99967 7.085396 -3.8997280
99979 5.132715 -0.5832806
99992 -4.265711 -5.6205076
# The tail of the random effects from samm
R> data.frame(int=kw::ranef(f.samm)[10727:10732],
+ yrs=kw::ranef(f.samm)[21459:21464])
int yrs
id_99912 5.693616 -3.6080847
id_99937 13.973628 4.4767916
id_99942 13.716419 -1.5579320
id_99967 7.082693 -3.8993951
id_99979 5.131698 -0.5828662
id_99992 -4.265344 -5.6211783
# Correlation of samm/lmer 'id' random effects
R> cor(lme4:::ranef(f.lmer)[[1]][,1], kw::ranef(f.samm)[1:10732])
[1] 1
R> cor(lme4:::ranef(f.lmer)[[1]][,2], kw::ranef(f.samm)[10733:21464])
[1] 1
More information about the R-sig-mixed-models
mailing list