[R-sig-ME] Trouble Replicating Unstructured Mixed Procedure in R

Juliet Hannah juliet.hannah at gmail.com
Fri Jan 27 20:00:39 CET 2012


My post may be redundant, but just in case I wanted to comment.  I
think Kevin's post
addressed a few things that are helpful to start with.

The original proc mixed syntax did not have a random statement. This
is similar to what
Pinheiro and Bates (Chapter 5) describe as an extended linear model.
Therefore, it seems we should not be able
to reproduce this with  lme and nlme. Instead, we use gls. I think
some people do not like to call this
a mixed effect model because the intercept or slope are not random,
but I can see why SAS would include
it with proc mixed because of the similarities. The SAS reference for
this is Littell (SAS for mixed models) Chapter 5.

proc mixed data=formixed;
class gender person;
model y = gender|age/solution;
repeated / type=un sub=person;
run;

gives me a similar result to

 gls( y ~ gender + age + gender*age , formixed, correlation =
corSymm(form = ~ 1 | person))

SAS output:

Intercept 15.84
gender 1.58
age 0.82
age*gender -.35

nlme output:

Coefficients:

(Intercept)      15.93
genderFemale      1.47
age               0.82
genderFemale:age -0.35



On Fri, Jan 27, 2012 at 12:03 AM, Kevin Wright <kw.stat at gmail.com> wrote:
> Dear Charles,
>
> First, I hope you are not put off by the tone of some the responses in this
> email thread.  Sometimes people have strong opinions and it might come
> across aggressively.
>
> Second, be sure to understand that reproducing a SAS analysis with lme in
> no way violates any legal agreements that SAS may have, if for no other
> reason than you never signed an agreement with SAS!  That bit in the EULA
> about decompiling and reverse engineering means that people are prohibited
> from creating a new version of PROC MIXED that does the same thing.  The
> nlme package uses different methods than SAS. E.g. different optimizers,
> even uses a log-parameterization deep in the code so that negative variance
> components cannot happen.
>
> Third, by now you've probably figured out that PROC MIXED and lme have very
> different ideas about degrees of freedom.  Also, the loglikelihoods are on
> different scales.  For that reason, when I try to reproduce an analysis, I
> find the best way to compare is to look at the variance components.
>
> Here is what PROC MIXED says:
>
>    Cov Parm Estimate     Std Error       Z  Pr > |Z|
> UN(1,1)     5.41545455    1.53172185    3.54    0.0004
> UN(2,1)     2.71681818    1.09623989    2.48    0.0132
> UN(2,2)     4.18477273    1.18363247    3.54    0.0004
> UN(3,1)     3.91022727    1.41775367    2.76    0.0058
> UN(3,2)     2.92715909    1.19304751    2.45    0.0141
> UN(3,3)     6.45573864    1.82595863    3.54    0.0004
> UN(4,1)     2.71022727    1.17209851    2.31    0.0208
> UN(4,2)     3.31715909    1.12903016    2.94    0.0033
> UN(4,3)     4.13073864    1.40356157    2.94    0.0033
> UN(4,4)     4.98573864    1.41017984    3.54    0.0004
> Residual         1.00000000 .       .         .
>
> That's our target, and here is the lme code to get us there.
>
> library(nlme)
> orth <- as.data.frame(Orthodont)
> m1 <- gls(distance ~ Sex*age,
>          correlation=corSymm(form = ~ 1 | Subject),
>          weights = varIdent(form = ~ 1 | age),
>          data = orth)
> cors <- corMatrix(m1$modelStruct$corStruct)[[1]]
> vars <- c(1.0000000, 0.8788793, 1.0744596, 0.9586878)^2 * 2.329213^2
> covs <- outer(vars,vars, function(x,y) sqrt(x)*sqrt(y))
> cors*covs
>
> Now, some explanations will surely help, so let's step through the code
> with comments.
>
> The Orthodont data is a groupedData object, which can be helpful, but
> sometimes confusing, so coercing to a data.frame removes the formula--you
> can see it with 'formula(Orthodont)'.
>
> orth <- as.data.frame(Orthodont)
>
> Since there is no "random" line in the PROC MIXED code, we don't want to
> use 'lme' (also why I removed the formula from the data), but instead use
> 'gls' for generalized least squares.  The 'corSymm' part specifies a
> symmetric correlation matrix with 1 on the diagonal.  Looking at the MIXED
> parameters, the results are given a covariances, not correlations.  Note
> how the variances UN(1,1), UN(2,2,), etc are different, not constant along
> the diagonal.  We use the 'weights' statement to specify different stratum
> variances.
>
> m1 <- gls(distance ~ Sex*age,
>          correlation=corSymm(form = ~ 1 | Subject),
>          weights = varIdent(form = ~ 1 | age),
>          data = orth)
>
> Now we have to convert the correlations and standard deviations of lme into
> covariance parameters of MIXED.
>
> First, extract the correlation matrix from lme.
>
> cors <- corMatrix(m1$modelStruct$corStruct)[[1]]
>
> Now, hard-code the stratum std deviations and square them to get
> variances.  Multiply by the square of the residual std dev.  (There's a way
> to extract all these values from the fitted model instead of hand-typing
> them, but I can't find them at the moment.)
>
> vars <- c(1.0000000, 0.8788793, 1.0744596, 0.9586878)^2 * 2.329213^2
>
> Now create a matrix of variances and covariances.
>
> covs <- outer(vars,vars, function(x,y) sqrt(x)*sqrt(y))
>
> Finally, multiply the correlations by the covariances.  Let's compare
> results:
>
> PROC MIXED:
>
> Row Col1 Col2 Col3 Col4
> 1 5.415 2.716 3.910 2.710
> 2 2.716 4.184 2.927 3.317
> 3 3.910 2.927 6.455 4.130
> 4 2.710 3.317 4.130 4.985
>
> R> round(cors*covs,3)
>      [,1]  [,2]  [,3]  [,4]
> [1,] 5.425 2.709 3.841 2.715
> [2,] 2.709 4.191 2.975 3.314
> [3,] 3.841 2.975 6.263 4.133
> [4,] 2.715 3.314 4.133 4.986
>
> In my experience, the small differences in the results are typical for
> unstructured models.
>
> Hope this helps.
>
> Kevin Wright
>
>
> On Tue, Jan 24, 2012 at 8:32 PM, Charles Determan Jr <deter088 at umn.edu>wrote:
>
>> Greetings,
>>
>> I have been working on R for some time now and I have begun the endeavor of
>> trying to replicate some SAS code in R.  I have scoured the forums but
>> haven't been able to find an answer.  I hope one of you could be so kind as
>> to enlighten me.
>>
>> I am attempting to replicate a repeated measures experiment using some
>> standard data.  I have posted the SAS code and output directly from a
>> publication as well as my attempts in R to replicate it.  My main issue
>> comes with the 'unstructured' component.
>>
>> The 'dental' dataset from 'mixedQF' package,
>> equivalent to formixed data in SAS
>>
>>    distance age Subject    Sex
>> 1       26.0   8     M01   Male
>> 2       25.0  10     M01   Male
>> 3       29.0  12     M01   Male
>> 4       31.0  14     M01   Male
>> 5       21.5   8     M02   Male
>> 6       22.5  10     M02   Male
>> 7       23.0  12     M02   Male
>> 8       26.5  14     M02   Male
>> 9       23.0   8     M03   Male
>> 10      22.5  10     M03   Male
>> 11      24.0  12     M03   Male
>> 12      27.5  14     M03   Male
>> 13      25.5   8     M04   Male
>> 14      27.5  10     M04   Male
>> 15      26.5  12     M04   Male
>> 16      27.0  14     M04   Male
>> 17      20.0   8     M05   Male
>> 18      23.5  10     M05   Male
>> 19      22.5  12     M05   Male
>> 20      26.0  14     M05   Male
>> 21      24.5   8     M06   Male
>> 22      25.5  10     M06   Male
>> 23      27.0  12     M06   Male
>> 24      28.5  14     M06   Male
>> 25      22.0   8     M07   Male
>> 26      22.0  10     M07   Male
>> 27      24.5  12     M07   Male
>> 28      26.5  14     M07   Male
>> 29      24.0   8     M08   Male
>> 30      21.5  10     M08   Male
>> 31      24.5  12     M08   Male
>> 32      25.5  14     M08   Male
>> 33      23.0   8     M09   Male
>> 34      20.5  10     M09   Male
>> 35      31.0  12     M09   Male
>> 36      26.0  14     M09   Male
>> 37      27.5   8     M10   Male
>> 38      28.0  10     M10   Male
>> 39      31.0  12     M10   Male
>> 40      31.5  14     M10   Male
>> 41      23.0   8     M11   Male
>> 42      23.0  10     M11   Male
>> 43      23.5  12     M11   Male
>> 44      25.0  14     M11   Male
>> 45      21.5   8     M12   Male
>> 46      23.5  10     M12   Male
>> 47      24.0  12     M12   Male
>> 48      28.0  14     M12   Male
>> 49      17.0   8     M13   Male
>> 50      24.5  10     M13   Male
>> 51      26.0  12     M13   Male
>> 52      29.5  14     M13   Male
>> 53      22.5   8     M14   Male
>> 54      25.5  10     M14   Male
>> 55      25.5  12     M14   Male
>> 56      26.0  14     M14   Male
>> 57      23.0   8     M15   Male
>> 58      24.5  10     M15   Male
>> 59      26.0  12     M15   Male
>> 60      30.0  14     M15   Male
>> 61      22.0   8     M16   Male
>> 62      21.5  10     M16   Male
>> 63      23.5  12     M16   Male
>> 64      25.0  14     M16   Male
>> 65      21.0   8     F01 Female
>> 66      20.0  10     F01 Female
>> 67      21.5  12     F01 Female
>> 68      23.0  14     F01 Female
>> 69      21.0   8     F02 Female
>> 70      21.5  10     F02 Female
>> 71      24.0  12     F02 Female
>> 72      25.5  14     F02 Female
>> 73      20.5   8     F03 Female
>> 74      24.0  10     F03 Female
>> 75      24.5  12     F03 Female
>> 76      26.0  14     F03 Female
>> 77      23.5   8     F04 Female
>> 78      24.5  10     F04 Female
>> 79      25.0  12     F04 Female
>> 80      26.5  14     F04 Female
>> 81      21.5   8     F05 Female
>> 82      23.0  10     F05 Female
>> 83      22.5  12     F05 Female
>> 84      23.5  14     F05 Female
>> 85      20.0   8     F06 Female
>> 86      21.0  10     F06 Female
>> 87      21.0  12     F06 Female
>> 88      22.5  14     F06 Female
>> 89      21.5   8     F07 Female
>> 90      22.5  10     F07 Female
>> 91      23.0  12     F07 Female
>> 92      25.0  14     F07 Female
>> 93      23.0   8     F08 Female
>> 94      23.0  10     F08 Female
>> 95      23.5  12     F08 Female
>> 96      24.0  14     F08 Female
>> 97      20.0   8     F09 Female
>> 98      21.0  10     F09 Female
>> 99      22.0  12     F09 Female
>> 100     21.5  14     F09 Female
>> 101     16.5   8     F10 Female
>> 102     19.0  10     F10 Female
>> 103     19.0  12     F10 Female
>> 104     19.5  14     F10 Female
>> 105     24.5   8     F11 Female
>> 106     25.0  10     F11 Female
>> 107     28.0  12     F11 Female
>> 108     28.0  14     F11 Female
>>
>> *Mixed modeling and fixed effect test*
>> SAS
>> proc mixed data=formixed;
>> class gender age person;
>> model y = gender|age;
>> repeated / type=cs sub=person;
>> run;
>>
>> output of interest to me
>>          Tests of Fixed Effects
>> Source             NDF   DDF    Type III F    Pr > F
>> GENDER           1        25        9.29        0.0054
>> AGE                  3        75       35.35       0.0001
>> GENDER*AGE   3        75        2.36        0.0781
>>
>> R (nlme package)
>> y=lme(distance~Sex*age, random=(~1|Subject), data=dental)
>> anova(y)
>>
>>            numDF denDF  F-value p-value
>> (Intercept)     1    75 4123.156  <.0001
>> Sex              1    25    9.292  0.0054
>> age               3    75   40.032  <.0001
>> Sex:age        3    75    2.362  0.0781
>>
>> Now this isn't exact but it is extremely close, however when I try to
>> replicate the unstructured,
>>
>> SAS
>> proc mixed data=formixed;
>> class gender age person;
>> model y = gender|age;
>> repeated / type=un sub=person;
>> run;
>>
>>             Tests of Fixed Effects
>> Source          NDF DDF Type III F Pr > F
>> GENDER         1    25     9.29    0.0054
>> AGE                3    25    34.45   0.0001
>> GENDER*AGE 3    25     2.93    0.0532
>>
>> R
>> either
>> y=lme(distance~Sex*age, random=(~1|Subject), corr=corSymm(,~1|Subject),
>> data=dental)
>> anova(y)
>> or
>> z=lme(distance~Sex*age, random=(~1|Subject), corr=corSymm(), data=dental)
>> anova(z)
>>
>> gives the output
>>
>>            numDF denDF  F-value    p-value
>> (Intercept)     1    75     4052.028  <.0001
>> Sex              1    25       8.462      0.0075
>> age               3    75      39.022    <.0001
>> Sex:age        3    75       2.868      0.0421
>>
>> What am I doing wrong to replicate the unstructured linear mixed model from
>> SAS?
>>
>> Regards,
>>
>> Charles
>>
>>        [[alternative HTML version deleted]]
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
>
> --
> Kevin Wright
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models




More information about the R-sig-mixed-models mailing list