[R-sig-ME] Desperately seeking help with simple lmer fit (lmer vs Proc Mixed)
Douglas Bates
bates at stat.wisc.edu
Wed Feb 18 20:56:56 CET 2009
Thanks for the example.
The estimates of the variance components are the result of an
optimization and it appears that there are two local optima for this
problem. I'm sorry to say that the optimum determined by SAS PROC
MIXED represents a better fit (a lower deviance) than the one
determined by lmer.
If you plot the response versus run, say
dotplot(reorder(Run, Y) ~ Y)
you will see that there is one run (number 13) with unusually large Y,
which may have abnormal influence. I was able to reproduce the SAS
results by modifying the starting estimates. This is not optimal
because it means you need to know the correct answer before you can
calculate it.
> (fm1 <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE))
0: 1536.6629: 0.666667 0.384900
1: 1441.4487: 1.44044 1.01837
2: 1421.4662: 2.35838 0.621655
3: 1408.1762: 3.27848 0.229969
4: 1401.9874: 4.04888 0.00000
5: 1399.8678: 4.59188 0.00000
6: 1398.9477: 5.12145 0.00000
7: 1398.7722: 5.41633 0.00000
8: 1398.7543: 5.53332 0.00000
9: 1398.7539: 5.55421 0.00000
10: 1398.7539: 5.55527 1.68436e-07
11: 1398.7539: 5.55527 1.68436e-07
Linear mixed model fit by REML
Formula: Y ~ (1 | Run) + (1 | Prep)
Data: pr
AIC BIC logLik deviance REMLdev
1407 1417 -699.4 1410 1399
Random effects:
Groups Name Variance Std.Dev.
Run (Intercept) 3.5859e+05 5.9882e+02
Prep (Intercept) 3.2965e-10 1.8156e-05
Residual 1.1619e+04 1.0779e+02
Number of obs: 108, groups: Run, 18; Prep, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 28216.5 141.5 199.4
> (fm1a <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE, start = c(4,4)))
0: 1393.6144: 4.00000 4.00000
1: 1393.5078: 3.74366 4.09884
2: 1393.4819: 3.72800 4.32664
3: 1393.4810: 3.73535 4.37362
4: 1393.4810: 3.73373 4.37885
5: 1393.4810: 3.73404 4.37877
6: 1393.4810: 3.73404 4.37875
7: 1393.4810: 3.73404 4.37875
Linear mixed model fit by REML
Formula: Y ~ (1 | Run) + (1 | Prep)
Data: pr
AIC BIC logLik deviance REMLdev
1401 1412 -696.7 1406 1393
Random effects:
Groups Name Variance Std.Dev.
Run (Intercept) 162010 402.50
Prep (Intercept) 222784 472.00
Residual 11619 107.79
Number of obs: 108, groups: Run, 18; Prep, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 28216 215 131.2
If you remove Run 13 you do get results like those from SAS using lmer
> (fm2 <- lmer(Y ~ (1|Run) + (1|Prep), pr, verbose = TRUE, subset = Run != 13))
0: 1426.8110: 0.666667 0.396059
1: 1342.0591: 1.47666 0.982502
2: 1325.5670: 2.35364 0.501989
3: 1311.1286: 3.60322 0.00000
4: 1309.3047: 3.96698 0.00000
5: 1307.8040: 4.56388 1.65053e-05
6: 1307.5200: 4.88366 6.39529e-05
7: 1307.4763: 5.04660 0.000170199
8: 1307.4746: 5.08457 0.000341660
9: 1307.4745: 5.08786 0.000627531
10: 1307.4745: 5.08806 0.00113168
11: 1307.4743: 5.09250 0.0198356
12: 1306.7915: 5.55085 2.00076
13: 1306.7202: 5.57778 2.29410
14: 1304.5962: 3.26016 2.72067
15: 1304.1219: 3.90630 4.98691
16: 1303.9334: 3.70822 4.85924
17: 1303.7839: 3.39994 4.44476
18: 1303.7039: 3.78966 4.10571
19: 1303.6539: 3.50693 3.67340
20: 1303.6333: 3.63992 3.74762
21: 1303.6285: 3.55042 3.87085
22: 1303.6259: 3.59760 3.87143
23: 1303.6258: 3.59381 3.86414
24: 1303.6258: 3.59257 3.86456
25: 1303.6258: 3.59308 3.86577
26: 1303.6258: 3.59318 3.86504
27: 1303.6258: 3.59312 3.86514
Linear mixed model fit by REML
Formula: Y ~ (1 | Run) + (1 | Prep)
Data: pr
Subset: Run != 13
AIC BIC logLik deviance REMLdev
1312 1322 -651.8 1316 1304
Random effects:
Groups Name Variance Std.Dev.
Run (Intercept) 135857 368.59
Prep (Intercept) 157206 396.49
Residual 10523 102.58
Number of obs: 102, groups: Run, 17; Prep, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 28161.8 185.5 151.8
This example may be motivation for me to try out a modification in the
optimization method that I have been contemplating.
On Wed, Feb 18, 2009 at 12:41 PM, David LeBlond
<David.LeBlond at abbott.com> wrote:
>
> Dear Dr. Bates,
>
> First, apologies for this request .... however I am desperate. I have spent
> 3 days on this. I must have a very fundamental misunderstanding of lmer or
> factors or grouped data or syntax or something else.
>
> To show due diligence, I have read all I can find on lmer syntax, gone
> through the SASmixed Package including the "lmer for SAS PROC MIXED Users",
> other presentations from yourself and others on lmer. I have visited the
> blogs. In the last 24h on this problem I have tried every permutation I can
> think of.
>
> The experiment is a simple balanced 3 level nested variance component study
> virtually identical to Example 4.4, page 156 of Little et al, SAS System for
> Mixed Models. There are 3 variables: There are 6 Y results nested within
> each of 18 runs. There are 3 runs nested within each of 6 preps. Site is
> ignored.
>
> I am using the lmer code analagous to what you used to analyze Example 4.4
> in lmer:
>
> lmer syntax: lmer(Y~1+(1|Prep/Run))
> MIXED syntax: proc mixed;class run prep site; model
> y=/s;random prep run(prep);run;
>
> The error variance components and fixed effect estimates agree, but the
> variance components for Prep and Run are completely different.
>
> Source lmer SAS
> Run 358580 162010
> Prep 0.012776 222784
>
> For a nearly identical problem, (Semi2 in SASmixed) lmer and MIXED give
> identical results. As I can repeat the SAS result in JMP, I trust it. I
> believe I am making some error in my use of lmer. I need to complete the
> analysis in lmer, not SAS or JMP.
>
> Can you or someone please explain to me why the variance component estimates
> are so radically different between MIXED and lmer? I know I am making some
> dumb mistake and I am prepared to be humbled.
>
> Thank you so much!!
>
> Dave
> ***************************************************************
> PS - Below I am sending my lmer data (a dif file), code and output along
> with the SAS data (in code), code and output.
>
>
> Here is the Excel data needed for R:
>
>
> Below is the R code which also contains the outputs and SAS data and code
> (sorry for length - 108 lines of data).
>
> library(lattice)
> library(foreign)
> library(lme4)
> Suitability<-read.DIF("C:/Documents and Settings/leblodj/Desktop/Synthroid
> Dissolution Method Change Lynn Davis Feb09/Suitability.dif",
> header=TRUE,transpose=TRUE,nrows=369,colClasses="character")
>
> str(Suitability)
> Y<-as.numeric(Suitability$Y)
> Site<-ordered(Suitability$Site)
> Prep<-ordered(Suitability$Prep)
> Run<-ordered(Suitability$Run)
> levels(Run)
> levels(Prep)
> lmer(Y~1+(1|Prep/Run))
>
> ####### lmer output #################
> #Linear mixed model fit by REML
> #Formula: Y ~ 1 + (1 | Prep/Run)
> # AIC BIC logLik deviance REMLdev
> # 1407 1417 -699.4 1410 1399
> #Random effects:
> # Groups Name Variance Std.Dev.
> # Run:Prep (Intercept) 3.5858e+05 598.81878
> # Prep (Intercept) 1.2776e-02 0.11303
> # Residual 1.1619e+04 107.79332
> #Number of obs: 108, groups: Run:Prep, 18; Prep, 6
> #
> #Fixed effects:
> # Estimate Std. Error t value
> #(Intercept) 28216.5 141.5 199.4
> #######################################
>
>
>
> #############################################
>
>
> ##### SAS Proc Mixed Output (matches JMP) ##
> ##### See full SAS code Below ##############
> # Covariance Parameter
> # Estimates
> #
> #Cov Parm Estimate
> #
> #Prep 222784
> #Run(Prep) 162010
> #Residual 11619
> Solution for Fixed Effects
> Standard
> Effect Estimate Error DF t Value Pr > |t|
> Intercept 28216 215.03 5 131.22 <.0001
> ############################################
> ###### SAS Proc Mixed Code ##########
> data Suitability;
> input Site$ Prep$ Run$ Y;
> cards;
> 2 1 1 28325
> 2 1 1 28314
> 2 1 1 28277
> 2 1 1 28183
> 2 1 1 28423
> 2 1 1 28299
> 2 1 2 27578
> 2 1 2 27724
> 2 1 2 27541
> 2 1 2 27686
> 2 1 2 27679
> 2 1 2 27567
> 2 1 3 27207
> 2 1 3 27351
> 2 1 3 27416
> 2 1 3 27292
> 2 1 3 27168
> 2 1 3 27117
> 2 2 4 28036
> 2 2 4 28042
> 2 2 4 27902
> 2 2 4 28110
> 2 2 4 27794
> 2 2 4 28181
> 2 2 5 28018
> 2 2 5 28318
> 2 2 5 28170
> 2 2 5 28410
> 2 2 5 28444
> 2 2 5 28620
> 2 2 6 27764
> 2 2 6 28008
> 2 2 6 27849
> 2 2 6 28096
> 2 2 6 27788
> 2 2 6 27884
> 2 3 7 28201
> 2 3 7 28224
> 2 3 7 28206
> 2 3 7 28462
> 2 3 7 28290
> 2 3 7 28303
> 2 3 8 28244
> 2 3 8 28125
> 2 3 8 28078
> 2 3 8 28023
> 2 3 8 28037
> 2 3 8 28014
> 2 3 9 28738
> 2 3 9 28726
> 2 3 9 28765
> 2 3 9 28534
> 2 3 9 28518
> 2 3 9 28489
> 1 4 10 28717
> 1 4 10 28622
> 1 4 10 28790
> 1 4 10 28697
> 1 4 10 28531
> 1 4 10 28502
> 1 4 11 28478
> 1 4 11 28425
> 1 4 11 28609
> 1 4 11 28541
> 1 4 11 28518
> 1 4 11 28492
> 1 4 12 28808
> 1 4 12 28764
> 1 4 12 28745
> 1 4 12 28786
> 1 4 12 28773
> 1 4 12 28622
> 1 5 13 29405
> 1 5 13 29372
> 1 5 13 29435
> 1 5 13 29831
> 1 5 13 29550
> 1 5 13 29402
> 1 5 14 28708
> 1 5 14 28684
> 1 5 14 28772
> 1 5 14 28705
> 1 5 14 28749
> 1 5 14 28754
> 1 5 15 28559
> 1 5 15 28584
> 1 5 15 28636
> 1 5 15 28713
> 1 5 15 28652
> 1 5 15 28530
> 1 6 16 26991
> 1 6 16 27009
> 1 6 16 26949
> 1 6 16 26957
> 1 6 16 26965
> 1 6 16 26934
> 1 6 17 28174
> 1 6 17 28074
> 1 6 17 28113
> 1 6 17 28154
> 1 6 17 28200
> 1 6 17 27952
> 1 6 18 27628
> 1 6 18 27569
> 1 6 18 27624
> 1 6 18 27737
> 1 6 18 27675
> 1 6 18 27654
> ;
>
> proc mixed;
> class run prep site;
> model y=/s;
> random prep run(prep);run;
>
>
> ________________________________
> David LeBlond
> Senior Research Statistician
> MS Statistics, MS/PhD Biochemistry
> Exploratory Statistics Global Pharmaceutical R&D
> 100 Abbott Park Road
> Abbott Park, IL 60064-3500
> USA R13-1 847-935-6031
> AP9A-1 847-935-1899
> David.LeBlond at abbott.com
>
> ________________________________
> This communication may contain information that is proprietary,
> confidential, or exempt from disclosure. If you are not the intended
> recipient, please note that any other dissemination, distribution, use or
> copying of this communication is strictly prohibited. Anyone who receives
> this message in error should notify the sender immediately by telephone or
> by return e-mail and delete it from his or her computer.
>
More information about the R-sig-mixed-models
mailing list