[R-sig-ME] Timings with SAMM, lme4, nlme

Fri Feb 2 17:33:12 CET 2007

(1) In short, I can find essentially no difference in the results of the
samm and lmer models (the most complex one that both functions could
handle):

lmer: math~ gr + sx + eth + cltype + (1+yrs|id) + (1|sch)
samm: math ~ gr + sx + eth + cltype, random=~ us(link(~1+yrs)):id + sch

After several hours searching through the return values of the two functions
(slots, S3 methods, S4 methods, extractors and all that) and identifying
unique approaches (lmer uses polynomial contrasts for ordered factors and
samm uses treatment contrasts; missing values in the data are handled
slightly differently in the way the random effects return values are
presented), I find the two functions have nearly identical variance
components and essentially identical fixed effects and random effects.  An R
transcript is attached for reference.

(2) Yes, SAMM is proprietary.  It is available on Windows and Linux, S-Plus
and R.  The developer has told me that version 2 is very near completion.
If you ever want to try it out, there is a 30-day free trial before it stops
working.

I use lme4 because it is open-source and has a good community of users,
published examples, etc.  I use samm to analyze data from plant breeding
experiments (current literature methods use large data sets, crossed random
effects, heteroskedasticity, spatial correlation, etc.).  SAMM also has
convenient reporting of linear predictions of BLUEs/BLUPs (Welham , Cullis,
Gogel, Gilmour, & Thompson 2004, Stroup & Mulitze 1991) for
decision-making.  I also use both because fitting the same model using two
different software packages (when the software capabilities allow for it)
really helps me think carefully and hard about what I'm asking the software
to do, what it actually does, and what the results actually mean.

Thanks for the challenging question...I learned more because of it.

Kevin

(thread edited below for brevity)

Doug Bates wrote:

I'm not sure if I have access to SAMM to check the timings.  Is SAMM
> proprietary?  My recollection is that it is proprietary and that it is
> Windows-only.  Either one of those characteristics is the kiss of
> death for my wanting to use it.
>
> Are you sure that SAMM is fitting a model with partially-crossed
> random effects?  If you are only considering students and schools as
> grouping factors for the random effects there will be little
> difference between assuming nested random effects and correcting for
> the partial crossing.  This is because there are very few students in
> this study who attend more than one school.  If you considered
> teachers as well as students and schools as grouping factors then you
> could more easily discern the difference in model fits taking into
> account the crossing or assuming that each student:teacher:school
> combination is unique.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20070202/36df8a8d/attachment.html>
-------------- next part --------------

R> library(mlmRev)
Loading required package: lme4
Loading required package: Matrix
R> library(samm)
Attaching package: 'samm'
R> library(kw)  # For ranef.samm, fixef.samm, VarCorr.samm

# samm and lmer treat missings slightly differently in the return value,
# so level the playing field by removing all observations with missing values
R> star2 <- subset(star, !(is.na(math) | is.na(gr) | is.na(sx) | is.na(eth) |
+                         is.na(cltype) | is.na(yrs) | is.na(id) | is.na(sch)))

# In the 'star' data, gr is an "ordered factor"
# lmer uses polynomial contrasts for ordered factors by default
# samm uses treatment contrasts for ordered factors
# Make 'gr' into an unordered factor
R> star2$gr <- as.factor(as.character(star2$gr))

# Fit the models.  lmer: 10.4 seconds lmer2: 11.3  samm: 6.0
R> system.time(f.lmer <- lmer(math~ gr + sx + eth + cltype +
+                            (1+yrs|id) + (1|sch), data=star2,
+                            control=list(gradient = FALSE, niterEM = 0)))
[1] 16.13  0.40 18.22    NA    NA

R> system.time(f.lmer2 <- lmer2(math~ gr + sx + eth + cltype +
+                            (1+yrs|id) + (1|sch), data=star2))
[1] 17.00  0.53 18.50    NA    NA

R> system.time(f.samm <- samm(math ~ gr + sx + eth + cltype,
+                          random=~ us(link(~1+yrs)):id + sch ,
+                          data=star2))
+-----------------------------------------------------------------+
Spatial Analysis <> Mixed Models   Version: VSN 1.10
AIsamm license expires: 65535 days
SAMM Convergence Monitoring: Thu Feb 01 17:08:08 2007
+-----------------------------------------------------------------+

Seq  Component 
  1  link("~1 + yrs"):id!Intercept:Intercept
  2  link("~1 + yrs"):id!yrs:Intercept
  3  link("~1 + yrs"):id!yrs:yrs
  4  sch
  5  R!variance

Equations:  21560 (17 dense)
Initial update shrinkage factor: 0.316
Singularities: 4 (more)

     LogLik         S2      DF          1          2          3          4          5

-101615.5560    898.1042 24566      0.150      0.100      0.150      0.100      1.000  17:08:09
-100808.5510    796.0875 24566      0.435      0.071      0.118      0.125      1.000  17:08:10
-100144.8784    689.3688 24566      0.966     -0.005      0.101      0.167      1.000  17:08:10
 -99855.8696    603.4995 24566      1.786     -0.155      0.113      0.214      1.000  17:08:11
 -99838.3681    584.1043 24566      2.086     -0.228      0.128      0.212      1.000  17:08:11
 -99838.0609    583.0869 24566      2.119     -0.240      0.130      0.209      1.000  17:08:12
 -99838.0569    583.1879 24566      2.121     -0.241      0.130      0.209      1.000  17:08:12
FINAL   parameter values:           2.121     -0.241      0.130      0.209      1.000
Constraint codes:                       U          U          U          P          P

Exit status: 0 - LogLikelihood Converged
Finished on: Thu Feb 01 17:08:12 2007

Convergence monitoring: LogLikelihood Converged 
[1] 5.70 0.02 6.38   NA   NA

# Use results of lmer, since extractor functions seem easier to use

# Variance components for samm
R> kw::VarCorr(f.samm)
Response:  math 
                                        Component Std Err Z Ratio    Constraint
link("~1 + yrs"):id!Intercept:Intercept 1236.8585 30.3504  40.753 Unconstrained
link("~1 + yrs"):id!yrs:Intercept       -140.5416 10.4943 -13.392 Unconstrained
link("~1 + yrs"):id!yrs:yrs               75.8448  5.0346  15.065 Unconstrained
sch                                      121.6656 21.1143   5.762      Positive
R!variance                               583.1879  8.6846  67.152      Positive

# Variance components for lmer
R> VarCorr(f.lmer)$id
2 x 2 Matrix of class "dpoMatrix"
            (Intercept)        yrs
(Intercept)   1236.9322 -140.56470
yrs           -140.5647   75.85085

R> VarCorr(f.lmer)$sch
1 x 1 Matrix of class "dpoMatrix"
            (Intercept)
(Intercept)    121.6565

R> attr(VarCorr(f.lmer),"sc")^2
   scale 
583.2067 

# Fixed effects for samm
R> kw::fixef(f.samm)
cltype_small   cltype_reg cltype_reg+A        eth_W        eth_B        eth_A 
    0.000000    -7.340255    -6.054712     0.000000   -22.934383     2.554495 
       eth_H        eth_I        eth_O         sx_M         sx_F         gr_1 
    1.410943   -34.301117     3.402846     0.000000     2.797615     0.000000 
        gr_2         gr_3         gr_K  (Intercept) 
   47.993067    83.587681   -44.421802   539.051085 

# Fixed effects for lmer
R> fixef(f.lmer)
(Intercept)         gr2         gr3         grK         sxF        ethB 
 539.051209   47.993546   83.588789  -44.421994    2.797348  -22.934191 
       ethA        ethH        ethI        ethO   cltypereg cltypereg+A 
   2.556208    1.411219  -34.300729    3.405428   -7.340299   -6.054774 

# Correlation of fixed effects for samm/lmer
R> fs <- kw::fixef(f.samm)
R> fs <- fs[abs(fs)>.001]
R> cor(sort(fs), sort(fixef(f.lmer)))
[1] 1

R> # samm random intercept for each school
R> as.vector(tail(kw::ranef(f.samm),80))
 [1]   1.3143717 -20.5348662   3.2968950  -1.0183895 -21.1174158 -17.1111596
 [7]   8.1185468  -3.5960209   1.9353613  21.5313533  19.7123450  13.1381558
[13]   5.1417748   6.4959617  -4.9180225  -9.1269814  -8.5555231  -8.0622654
[19] -12.9893362   7.3534878  -8.7966736   0.1812109  20.5266225  11.7676528
[25]  -8.8642947 -19.4369518  11.1248498  -4.7711922  17.2450765   8.9582622
[31]  -2.9094668 -13.8162709  -9.0721915  -7.1982139 -10.0674440   4.3567975
[37]   2.9597060  -7.9531537   3.9690027  -2.6906443  25.1882796  -6.8958274
[43]   3.0637880  -9.1939666   5.5059775  -1.9412242  -7.6311463 -11.8170561
[49]  -7.0021395  -5.6523660   4.6034859   6.9618590  -9.6490649   2.2870641
[55] -12.6429433 -16.4871068 -10.5297346  -0.6011866  -0.3508403 -13.0738001
[61]   9.2144515  -3.0903109   9.7637953   0.4114071   9.0939491   1.9779132
[67]  -3.2119299  12.8595955  17.8323498   0.8572007  -8.1456880  15.8826757
[73]   5.1027719  23.8611109   1.9011668  -6.3425222   5.6704674  -2.2273366
[79]   9.7215416  -1.7956171

# lmer random intercept for each school
R> lme4:::ranef(f.lmer)[[2]][,1]
 [1]   1.3133874 -20.5335295   3.2956710  -1.0160289 -21.1169777 -17.1095972
 [7]   8.1168346  -3.5945118   1.9352882  21.5310053  19.7115966  13.1401330
[13]   5.1383739   6.4961299  -4.9184228  -9.1255942  -8.5557981  -8.0617211
[19] -12.9903605   7.3527759  -8.7991525   0.1799393  20.5243880  11.7679624
[25]  -8.8640390 -19.4362049  11.1219601  -4.7703302  17.2445306   8.9532094
[31]  -2.9119062 -13.8155380  -9.0719066  -7.1963657 -10.0661642   4.3582431
[37]   2.9599896  -7.9522175   3.9701887  -2.6930150  25.1871108  -6.8951564
[43]   3.0654065  -9.1979970   5.5086618  -1.9394761  -7.6314905 -11.8180674
[49]  -7.0025942  -5.6537153   4.6030342   6.9599747  -9.6486151   2.2865630
[55] -12.6427040 -16.4839846 -10.5288129  -0.6033224  -0.3515562 -13.0727761
[61]   9.2167562  -3.0882666   9.7630371   0.4125758   9.0944709   1.9818962
[67]  -3.2120202  12.8576685  17.8320933   0.8574842  -8.1447511  15.8819324
[73]   5.1012380  23.8613070   1.9013746  -6.3435578   5.6695532  -2.2235653
[79]   9.7241380  -1.7960726

# Correlation of samm/lmer random intercepts for school
R> cor(lme4:::ranef(f.lmer)[[2]][,1], as.vector(tail(kw::ranef(f.samm),80)))
[1] 1

# lmer Random slope/intercept for each 'id'
R> head(lme4:::ranef(f.lmer)[[1]])
       (Intercept)          yrs
100017   89.893617 -10.21549112
100028  -36.049228   4.09662647
100045    6.318275  -0.00364208
100064   -9.404819   1.06876159
100070   62.249604  -6.72732399
100096   26.410811   0.76421193

# samm 
R> data.frame(int=kw::ranef(f.samm)[1:6],
+            yrs=kw::ranef(f.samm)[10733:10738])
                 int           yrs
id_100017  89.890923 -10.209949210
id_100028 -36.049490   4.094556472
id_100045   6.317053  -0.003151222
id_100064  -9.403919   1.068111606
id_100070  62.244643  -6.723240707
id_100096  26.408558   0.766162668

# The tail of the random effects from lme
R> tail(lme4:::ranef(f.lmer)[[1]])
      (Intercept)        yrs
99912    5.695326 -3.6083994
99937   13.972907  4.4756514
99942   13.718499 -1.5589673
99967    7.085396 -3.8997280
99979    5.132715 -0.5832806
99992   -4.265711 -5.6205076

# The tail of the random effects from samm
R> data.frame(int=kw::ranef(f.samm)[10727:10732],
+            yrs=kw::ranef(f.samm)[21459:21464])
               int        yrs
id_99912  5.693616 -3.6080847
id_99937 13.973628  4.4767916
id_99942 13.716419 -1.5579320
id_99967  7.082693 -3.8993951
id_99979  5.131698 -0.5828662
id_99992 -4.265344 -5.6211783

# Correlation of samm/lmer 'id' random effects
R> cor(lme4:::ranef(f.lmer)[[1]][,1], kw::ranef(f.samm)[1:10732])
[1] 1
R> cor(lme4:::ranef(f.lmer)[[1]][,2], kw::ranef(f.samm)[10733:21464])
[1] 1