[R-sig-ME] Desperately seeking help with simple lmer fit (lmer vs Proc Mixed)

Wed Feb 18 21:20:58 CET 2009

On Wed, Feb 18, 2009 at 1:56 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
> Thanks for the example.
>
> The estimates of the variance components are the result of an
> optimization and it appears that there are two local optima for this
> problem.  I'm sorry to say that the optimum determined by SAS PROC
> MIXED represents a better fit (a lower deviance) than the one
> determined by lmer.
>
> If you plot the response versus run, say
>
> dotplot(reorder(Run, Y) ~ Y)
>
> you will see that there is one run (number 13) with unusually large Y,
> which may have abnormal influence.  I was able to reproduce the SAS
> results by modifying the starting estimates.  This is not optimal
> because it means you need to know the correct answer before you can
> calculate it.
>
>> (fm1 <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE))
>  0:     1536.6629: 0.666667 0.384900
>  1:     1441.4487:  1.44044  1.01837
>  2:     1421.4662:  2.35838 0.621655
>  3:     1408.1762:  3.27848 0.229969
>  4:     1401.9874:  4.04888  0.00000
>  5:     1399.8678:  4.59188  0.00000
>  6:     1398.9477:  5.12145  0.00000
>  7:     1398.7722:  5.41633  0.00000
>  8:     1398.7543:  5.53332  0.00000
>  9:     1398.7539:  5.55421  0.00000
>  10:     1398.7539:  5.55527 1.68436e-07
>  11:     1398.7539:  5.55527 1.68436e-07
> Linear mixed model fit by REML
> Formula: Y ~ (1 | Run) + (1 | Prep)
>   Data: pr
>  AIC  BIC logLik deviance REMLdev
>  1407 1417 -699.4     1410    1399
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  Run      (Intercept) 3.5859e+05 5.9882e+02
>  Prep     (Intercept) 3.2965e-10 1.8156e-05
>  Residual             1.1619e+04 1.0779e+02
> Number of obs: 108, groups: Run, 18; Prep, 6
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)  28216.5      141.5   199.4
>> (fm1a <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE, start = c(4,4)))
>  0:     1393.6144:  4.00000  4.00000
>  1:     1393.5078:  3.74366  4.09884
>  2:     1393.4819:  3.72800  4.32664
>  3:     1393.4810:  3.73535  4.37362
>  4:     1393.4810:  3.73373  4.37885
>  5:     1393.4810:  3.73404  4.37877
>  6:     1393.4810:  3.73404  4.37875
>  7:     1393.4810:  3.73404  4.37875
> Linear mixed model fit by REML
> Formula: Y ~ (1 | Run) + (1 | Prep)
>   Data: pr
>  AIC  BIC logLik deviance REMLdev
>  1401 1412 -696.7     1406    1393
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  Run      (Intercept) 162010   402.50
>  Prep     (Intercept) 222784   472.00
>  Residual              11619   107.79
> Number of obs: 108, groups: Run, 18; Prep, 6
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)    28216        215   131.2
>
> If you remove Run 13 you do get results like those from SAS using lmer
>
>> (fm2 <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE, subset = Run != 13))
>  0:     1426.8110: 0.666667 0.396059
>  1:     1342.0591:  1.47666 0.982502
>  2:     1325.5670:  2.35364 0.501989
>  3:     1311.1286:  3.60322  0.00000
>  4:     1309.3047:  3.96698  0.00000
>  5:     1307.8040:  4.56388 1.65053e-05
>  6:     1307.5200:  4.88366 6.39529e-05
>  7:     1307.4763:  5.04660 0.000170199
>  8:     1307.4746:  5.08457 0.000341660
>  9:     1307.4745:  5.08786 0.000627531
>  10:     1307.4745:  5.08806 0.00113168
>  11:     1307.4743:  5.09250 0.0198356
>  12:     1306.7915:  5.55085  2.00076
>  13:     1306.7202:  5.57778  2.29410
>  14:     1304.5962:  3.26016  2.72067
>  15:     1304.1219:  3.90630  4.98691
>  16:     1303.9334:  3.70822  4.85924
>  17:     1303.7839:  3.39994  4.44476
>  18:     1303.7039:  3.78966  4.10571
>  19:     1303.6539:  3.50693  3.67340
>  20:     1303.6333:  3.63992  3.74762
>  21:     1303.6285:  3.55042  3.87085
>  22:     1303.6259:  3.59760  3.87143
>  23:     1303.6258:  3.59381  3.86414
>  24:     1303.6258:  3.59257  3.86456
>  25:     1303.6258:  3.59308  3.86577
>  26:     1303.6258:  3.59318  3.86504
>  27:     1303.6258:  3.59312  3.86514
> Linear mixed model fit by REML
> Formula: Y ~ (1 | Run) + (1 | Prep)
>   Data: pr
>  Subset: Run != 13
>  AIC  BIC logLik deviance REMLdev
>  1312 1322 -651.8     1316    1304
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  Run      (Intercept) 135857   368.59
>  Prep     (Intercept) 157206   396.49
>  Residual              10523   102.58
> Number of obs: 102, groups: Run, 17; Prep, 6
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept)  28161.8      185.5   151.8
>
>
> This example may be motivation for me to try out a modification in the
> optimization method that I have been contemplating.

Well, the good news is that the modification was successful on this example

> (fm1 <- lmer(Y ~ 1 + (1|Run) + (1|Prep), pr, verbose = TRUE))
  0:     1536.6629: 0.816497 0.620403
  1:     1405.8868:  1.66562  1.14860
  2:     1399.0326:  1.71590  1.44355
  3:     1394.7782:  1.84561  1.71319
  4:     1393.9976:  1.79780  2.00855
  5:     1393.9690:  2.09290  2.05797
  6:     1393.5053:  1.96491  2.07162
  7:     1393.4829:  1.92326  2.08972
  8:     1393.4810:  1.93265  2.09030
  9:     1393.4810:  1.93240  2.09189
 10:     1393.4810:  1.93236  2.09254
 11:     1393.4810:  1.93237  2.09255
Linear mixed model fit by REML
Formula: Y ~ 1 + (1 | Run) + (1 | Prep)
   Data: pr
  AIC  BIC logLik deviance REMLdev
 1401 1412 -696.7     1406    1393
Random effects:
 Groups   Name        Variance Std.Dev.
 Run      (Intercept) 162010   402.50
 Prep     (Intercept) 222784   472.00
 Residual              11619   107.79
Number of obs: 108, groups: Run, 18; Prep, 6

Fixed effects:
            Estimate Std. Error t value
(Intercept)    28216        215   131.2

The bad news is that I made the modification in a development version
of the lme4 package and that version won't be ready for prime time for
a while.  I still have a lot of work to do on the generalized linear
mixed models.

> On Wed, Feb 18, 2009 at 12:41 PM, David LeBlond
> <David.LeBlond at abbott.com> wrote:
>>
>> Dear Dr. Bates,
>>
>> First, apologies for this request ....  however I am desperate. I have spent
>> 3 days on this. I must have a very fundamental misunderstanding of lmer or
>> factors or grouped data or syntax or something else.
>>
>> To show due diligence, I have read all I can find on lmer syntax, gone
>> through the SASmixed Package including the "lmer for SAS PROC MIXED Users",
>> other presentations from yourself and others on lmer. I have visited the
>> blogs.  In the last 24h on this problem I have tried every permutation I can
>> think of.
>>
>> The experiment is a simple balanced 3 level nested variance component study
>> virtually identical to Example 4.4, page 156 of Little et al, SAS System for
>> Mixed Models. There are 3 variables: There are 6 Y results nested within
>> each of 18 runs. There are 3 runs nested within each of 6 preps. Site is
>> ignored.
>>
>> I am using the lmer code analagous to what you used to analyze Example 4.4
>> in lmer:
>>
>>         lmer syntax:         lmer(Y~1+(1|Prep/Run))
>>         MIXED syntax:         proc mixed;class run prep site; model
>> y=/s;random prep run(prep);run;
>>
>> The error variance components and fixed effect estimates agree, but the
>> variance components for Prep and Run are completely different.
>>
>> Source                lmer                SAS
>> Run                   358580                162010
>> Prep                 0.012776        222784
>>
>> For a nearly identical problem, (Semi2 in SASmixed) lmer and MIXED give
>> identical results. As I can repeat the SAS result in JMP, I trust it. I
>> believe I am making some error in my use of lmer. I need to complete the
>> analysis in lmer, not SAS or JMP.
>>
>> Can you or someone please explain to me why the variance component estimates
>> are so radically different between MIXED and lmer? I know I am making some
>> dumb mistake and I am prepared to be humbled.
>>
>> Thank you so much!!
>>
>> Dave
>> ***************************************************************
>> PS - Below I am sending my lmer data (a dif file), code and output along
>> with the SAS data (in code), code and output.
>>
>>
>> Here is the Excel data needed for R:
>>
>>
>> Below is the R code which also contains the outputs and SAS data and code
>> (sorry for length - 108 lines of data).
>>
>> library(lattice)
>> library(foreign)
>> library(lme4)
>> Suitability<-read.DIF("C:/Documents and Settings/leblodj/Desktop/Synthroid
>> Dissolution Method Change Lynn Davis Feb09/Suitability.dif",
>>                header=TRUE,transpose=TRUE,nrows=369,colClasses="character")
>>
>> str(Suitability)
>> Y<-as.numeric(Suitability$Y)
>> Site<-ordered(Suitability$Site)
>> Prep<-ordered(Suitability$Prep)
>> Run<-ordered(Suitability$Run)
>> levels(Run)
>> levels(Prep)
>> lmer(Y~1+(1|Prep/Run))
>>
>> ####### lmer output #################
>> #Linear mixed model fit by REML
>> #Formula: Y ~ 1 + (1 | Prep/Run)
>> #  AIC  BIC logLik deviance REMLdev
>> # 1407 1417 -699.4     1410    1399
>> #Random effects:
>> # Groups   Name        Variance   Std.Dev.
>> # Run:Prep (Intercept) 3.5858e+05 598.81878
>> # Prep     (Intercept) 1.2776e-02   0.11303
>> # Residual             1.1619e+04 107.79332
>> #Number of obs: 108, groups: Run:Prep, 18; Prep, 6
>> #
>> #Fixed effects:
>> #            Estimate Std. Error t value
>> #(Intercept)  28216.5      141.5   199.4
>> #######################################
>>
>>
>>
>> #############################################
>>
>>
>> ##### SAS Proc Mixed Output (matches JMP)  ##
>> ##### See full SAS code Below  ##############
>> # Covariance Parameter
>> #      Estimates
>> #
>> #Cov Parm      Estimate
>> #
>> #Prep            222784
>> #Run(Prep)       162010
>> #Residual         11619
>>                    Solution for Fixed Effects
>>                          Standard
>> Effect       Estimate       Error      DF    t Value    Pr > |t|
>> Intercept       28216      215.03       5     131.22      <.0001
>> ############################################
>> ###### SAS Proc Mixed Code  ##########
>> data Suitability;
>> input Site$        Prep$        Run$        Y;
>> cards;
>> 2        1        1        28325
>> 2        1        1        28314
>> 2        1        1        28277
>> 2        1        1        28183
>> 2        1        1        28423
>> 2        1        1        28299
>> 2        1        2        27578
>> 2        1        2        27724
>> 2        1        2        27541
>> 2        1        2        27686
>> 2        1        2        27679
>> 2        1        2        27567
>> 2        1        3        27207
>> 2        1        3        27351
>> 2        1        3        27416
>> 2        1        3        27292
>> 2        1        3        27168
>> 2        1        3        27117
>> 2        2        4        28036
>> 2        2        4        28042
>> 2        2        4        27902
>> 2        2        4        28110
>> 2        2        4        27794
>> 2        2        4        28181
>> 2        2        5        28018
>> 2        2        5        28318
>> 2        2        5        28170
>> 2        2        5        28410
>> 2        2        5        28444
>> 2        2        5        28620
>> 2        2        6        27764
>> 2        2        6        28008
>> 2        2        6        27849
>> 2        2        6        28096
>> 2        2        6        27788
>> 2        2        6        27884
>> 2        3        7        28201
>> 2        3        7        28224
>> 2        3        7        28206
>> 2        3        7        28462
>> 2        3        7        28290
>> 2        3        7        28303
>> 2        3        8        28244
>> 2        3        8        28125
>> 2        3        8        28078
>> 2        3        8        28023
>> 2        3        8        28037
>> 2        3        8        28014
>> 2        3        9        28738
>> 2        3        9        28726
>> 2        3        9        28765
>> 2        3        9        28534
>> 2        3        9        28518
>> 2        3        9        28489
>> 1        4        10        28717
>> 1        4        10        28622
>> 1        4        10        28790
>> 1        4        10        28697
>> 1        4        10        28531
>> 1        4        10        28502
>> 1        4        11        28478
>> 1        4        11        28425
>> 1        4        11        28609
>> 1        4        11        28541
>> 1        4        11        28518
>> 1        4        11        28492
>> 1        4        12        28808
>> 1        4        12        28764
>> 1        4        12        28745
>> 1        4        12        28786
>> 1        4        12        28773
>> 1        4        12        28622
>> 1        5        13        29405
>> 1        5        13        29372
>> 1        5        13        29435
>> 1        5        13        29831
>> 1        5        13        29550
>> 1        5        13        29402
>> 1        5        14        28708
>> 1        5        14        28684
>> 1        5        14        28772
>> 1        5        14        28705
>> 1        5        14        28749
>> 1        5        14        28754
>> 1        5        15        28559
>> 1        5        15        28584
>> 1        5        15        28636
>> 1        5        15        28713
>> 1        5        15        28652
>> 1        5        15        28530
>> 1        6        16        26991
>> 1        6        16        27009
>> 1        6        16        26949
>> 1        6        16        26957
>> 1        6        16        26965
>> 1        6        16        26934
>> 1        6        17        28174
>> 1        6        17        28074
>> 1        6        17        28113
>> 1        6        17        28154
>> 1        6        17        28200
>> 1        6        17        27952
>> 1        6        18        27628
>> 1        6        18        27569
>> 1        6        18        27624
>> 1        6        18        27737
>> 1        6        18        27675
>> 1        6        18        27654
>> ;
>>
>> proc mixed;
>>   class run prep site;
>>   model y=/s;
>>   random prep run(prep);run;
>>
>>
>> ________________________________
>> David LeBlond
>> Senior Research Statistician
>> MS Statistics, MS/PhD Biochemistry
>> Exploratory Statistics Global Pharmaceutical R&D
>> 100 Abbott Park Road
>> Abbott Park, IL 60064-3500
>> USA R13-1 847-935-6031
>> AP9A-1 847-935-1899
>> David.LeBlond at abbott.com
>>
>> ________________________________
>> This communication may contain information that is proprietary,
>> confidential, or exempt from disclosure. If you are not the intended
>> recipient, please note that any other dissemination, distribution, use or
>> copying of this communication is strictly prohibited. Anyone who receives
>> this message in error should notify the sender immediately by telephone or
>> by return e-mail and delete it from his or her computer.
>>
>