[R-sig-ME] gls vs lme covariance structures

Charles Determan Jr deter088 at umn.edu
Tue May 8 15:42:17 CEST 2012


Thanks for your assistance Joshua, I really appreciate you weighing in
on this.  The autoregressive results are just fine; however, the
unstructured results are still very different between the SAS UN
structure and R corSymm().  Any further thoughts as to why this may be
or how it may be addressed?

Regards,
Charles

On Sat, May 5, 2012 at 10:05 AM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> Hi Charles,
>
> I know you were not trying to make it difficult, no worries.  Most of
> the results seem to match fairly closely.  Here is some stuff you can
> try if you want (also added some residual plots and examination of the
> distribution of random effects which is assumed gaussian).
>
> Cheers,
>
> Josh
>
> ####################################################
> dat=read.table("subset.csv",sep=",",header=TRUE, na.strings=".")
> dat34=dat[dat$group %in% c("3", "4"),]
> dat34=within(dat34, {
>        group=factor(group)
>        Event_name=factor(Event_name)
>        Died=factor(Died)
>        ID=factor(ID)
> })
> contrasts(dat34$Event_name)=contr.sum(n=6)
> contrasts(dat34$group)=contr.sum(n=2)
> contrasts(dat34$Died)=contr.sum(n=2)
> dat34$rep <- unlist(sapply(rle(as.numeric(dat34$ID))$lengths, seq.int))
>
> require(nlme)
> require(lattice)
>
> m.run <- lme(var ~ group + Event_name + Died,
>  random = ~ 1 | ID, corr = corSymm(), data = dat34)
> ## store residuals and plot
> dat34$r.mrun <- resid(m.run)
> xyplot(r.mrun ~ rep | ID, data = dat34)
> ## examine distribution of random effects
> qqmath(ranef(m.run)[[1]])
> summary(m.run)
> ## Linear mixed-effects model fit by REML
> ##  Data: dat34
> ##        AIC      BIC   logLik
> ##   47.06365 116.5417 1.468174
>
> ## Random effects:
> ##  Formula: ~1 | ID
> ##         (Intercept)  Residual
> ## StdDev:   0.3934213 0.2635925
>
> ## Correlation Structure: General
> ##  Formula: ~1 | ID
> ##  Parameter estimate(s):
> ##  Correlation:  <-- you can see the unstructure correlation matrix
> ##   1      2      3      4      5
> ## 2  0.289
> ## 3  0.553  0.774
> ## 4  0.225  0.532  0.752
> ## 5  0.532  0.635  0.844  0.662
> ## 6  0.717 -0.052  0.361 -0.053  0.535
>
> anova(m.run, type = "marginal", adjustSigma = FALSE)
> ##             numDF denDF  F-value p-value
> ## (Intercept)     1    96 9.816408  0.0023
> ## group           1    23 0.131949  0.7197
> ## Event_name      5    96 1.081781  0.3754
> ## Died            1    23 0.074427  0.7874
>
>
> #Output should be: (SAS mixed procedure with UN)
> #Type 3 Tests of Fixed Effects
> #Effect        NumDF DenDF  F Value    Pr > F
> #Pig_group         1    23     2.73    0.1120
> #Event_name        5    23     1.11    0.3806
> #Died              1    23     0.51    0.4833
>
>
> ## group*Event_name expands to: group + Event_name + group:Event_name
> ## i.e., main effects + interaction, like "group | Event_name in proc mixed
> m.run2 <- lme(var2 ~ group*Event_name + Died,
>  random = ~ 1 | ID, corr = corSymm(), data = dat34)
> dat34$r.run2 <- resid(m.run2)
> xyplot(r.run2 ~ rep | ID, data = dat34)
> ## examine distribution of random effects
> qqmath(ranef(m.run2)[[1]])
> summary(m.run2)
>
> ## auto regressive model
> m.rar1 <- lme(var2 ~ group*Event_name + Died,
>  random = ~ 1 | ID, corr = corAR1(), data = dat34)
> ## store residuals and plot
> dat34$r.rar1 <- resid(m.rar1)
> xyplot(r.rar1 ~ rep | ID, data = dat34)
> ## examine distribution of random effects
> qqmath(ranef(m.rar1)[[1]])
> summary(m.rar1)
> ## Linear mixed-effects model fit by REML
> ##  Data: dat34
> ##        AIC     BIC    logLik
> ##   413.8188 457.598 -190.9094
>
> ## Random effects:
> ##  Formula: ~1 | ID
> ##         (Intercept) Residual
> ## StdDev:    0.383367 1.004379
>
> ## Correlation Structure: AR(1)
> ##  Formula: ~1 | ID
> ##  Parameter estimate(s):
> ##       Phi
> ## 0.1092874 <-- phi (rho) autocorrelation estimate
>
> anova(m.rar1, type = "marginal", adjustSigma = FALSE)
> ##                  numDF denDF  F-value p-value
> ## (Intercept)          1    91 337.2176  <.0001
> ## group                1    23   0.9015  0.3523
> ## Event_name           5    91  15.9816  <.0001
> ## Died                 1    23   1.4741  0.2370
> ## group:Event_name     5    91   3.4790  0.0064
>
> ## Output should be:
> ## Type 3 Tests of Fixed Effects (SAS mixed procedure with AR(1))
> ## Effect                 NumDF   DenDF F Value  Pr > F
> ## Pig_group                  1      23    0.99  0.3293
> ## Event_name                 5      91   16.23  <.0001
> ## Died                       1      23 1.70 0.2047
> ## Pig_group*Event_name       5      91    3.04  0.0140
>
>
> ## compare auto regressive with unstructured
> anova(m.run2, m.rar1)
> ####################################################
>
>
> On Fri, May 4, 2012 at 6:00 AM, Charles Determan Jr <deter088 at umn.edu> wrote:
>> Joshua,
>>
>> My apologies, I never intended to make it more difficult for people to
>> help.  I only wish to be very careful when I every work with data.  I
>> have created a subset as you requested in the attached csv file.  The
>> two covariance structures I wish to do with a mixed model are (UN) for
>> var, and (AR(1)) for var2.
>>
>> dat=read.table("C:/…/subset.csv",sep=",",header=TRUE, na.strings=".")
>> attach(dat)
>>
>> dat34=dat[Pig_group %in% c("3", "4"),]
>> attach(dat34)
>> dat34=within(dat34, {
>>        group=factor(group)
>>        Event_name=factor(Event_name)
>>        Died=factor(Died)
>>        ID=factor(ID)
>> })
>> attach(dat34)
>>
>> contrasts(dat34$Event_name)=contr.sum(n=6)
>> contrasts(dat34$group)=contr.sum(n=2)
>> contrasts(dat34$Died)=contr.sum(n=2)
>>
>>
>> The models I wish to fit with a mixed model are:
>> UN
>> var~group+Event_name+Died,
>> random=~1|ID
>> corr=corSymm()?
>> anova(fit, type=”marginal”, adjustSigma=F)
>>
>> #Output should be: (SAS mixed procedure with UN)
>> Type 3 Tests of Fixed Effects
>> Effect        NumDF DenDF F Value Pr > F
>> Pig_group          1    23      2.73    0.1120
>> Event_name       5      23      1.11    0.3806
>> Died                  1         23      0.51    0.4833
>>
>> AR(1)
>> var2~group+Event_name+Died+group*Event_name
>> random=~1|ID
>> corr=corAR1()?
>> anova(fit, type=”marginal”, adjustSigma=F)
>>
>> Output should be:
>> Type 3 Tests of Fixed Effects (SAS mixed procedure with AR(1))
>> Effect                          NumDF     DenDF F Value Pr > F
>> Pig_group                              1             23    0.99      0.3293
>> Event_name                           5               91    16.23    <.0001
>> Died                                      1                  23    1.70      0.2047
>> Pig_group*Event_name           5             91    3.04     0.0140
>>
>>
>> Many thanks, I truly appreciate the guidance,
>> Charles
>>
>> On Thu, May 3, 2012 at 3:59 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>>>
>>> Hi Charles,
>>>
>>> Well you could post a subset of it, or make up some data that is
>>> sharable (whether the data make any sense is not important to us, just
>>> nice to have runable code, for example your previous thread about
>>> contrasts could have been solved in one email if we could have shown
>>> you how to set the contrasts on your data and then it matched your SAS
>>> output).  In any case, whether you use lme or gls really depends on
>>> your question and goals, I think.  Generalized least squares is not
>>> the same as a random effects model.  If you want a random effect, you
>>> cannot use gls.  If you just want correlated errors, gls is fine.
>>>
>>> This part of your code strikes me as atypical though I cannot promise
>>> it is wrong/not what you want: corr=corAR1(ID)
>>>
>>> Cheers,
>>>
>>> Josh
>>>
>>>
>>> On Thu, May 3, 2012 at 1:44 PM, Charles Determan Jr <deter088 at umn.edu> wrote:
>>> > Hi Joshua,
>>> >
>>> > Thanks for your response.  It is probably best that I don't post the data as
>>> > some of it is not yet published.  My main question is whether UN and AR(1)
>>> > should be done with gls or if I have done the syntax incorrectly with lme.
>>> > Since AR(1) is replicated perfectly if I put the correct dendf, I can work
>>> > with it.  And UN is close, so I just want to be sure my use and syntax are
>>> > correct, not necessarily modifying the data.
>>> >
>>> > Regards,
>>> > Charles
>>> >
>>> >
>>> > On Thu, May 3, 2012 at 3:21 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
>>> >>
>>> >> Hi Charles,
>>> >>
>>> >> Could you upload the dataset you are using somewhere and post the
>>> >> link?  Something like:
>>> >>
>>> >> ##########
>>> >> dat34 <- read.csv("http:/wherever/you/uploaded/yourdata.csv", header =
>>> >> TRUE)
>>> >> ## code to convert to factors anything that needs to be etc.
>>> >> ##########
>>> >>
>>> >> Then it is easier for us to try things that way.
>>> >>
>>> >> corAR1 and corSymm seem appropriate.  Have you checked the examples in
>>> >> their documents?  I found them helpful.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Josh
>>> >>
>>> >> On Thu, May 3, 2012 at 12:58 PM, Charles Determan Jr <deter088 at umn.edu>
>>> >> wrote:
>>> >> > Greetings R users,
>>> >> >
>>> >> > I have been attempting to replicate various covariance structures from
>>> >> > SAS's analysis of mixed models.  I have successfully been able to
>>> >> > replicate
>>> >> > compound symmetry, however it becomes confusing with autoregression and
>>> >> > unstructured.  As such, there are two questions regarding this issue.
>>> >> >
>>> >> > Autoregression
>>> >> > SAS output (Type III fixed effects) for covariance structure AR(1)
>>> >> >
>>> >> > *Type 3 Tests of Fixed Effects*
>>> >> >
>>> >> > *Effect                    NumDF DenDF F Value Pr > F*
>>> >> >
>>> >> > *group                          *1         23          0.99      0.3293
>>> >> >
>>> >> > *Event_name               *5         91         16.23    <.0001
>>> >> >
>>> >> > *Died                            *1         23          1.70      0.2047
>>> >> >
>>> >> > *group*Event_name  *5        91          3.04     0.0140
>>> >> >
>>> >> > R output (corAR1=AR(1)?)
>>> >> > I can replicate these results if I run the following:
>>> >> >
>>> >> > fit.18=gls(var~group+Event_name+Died+group*Event_name,
>>> >> >    data=dat34,
>>> >> >    corr=corAR1(, ~1|ID),
>>> >> >    weight=varIdent(~1|Event_name))
>>> >> > anova(fit.18, type="marginal", adjustSigma=F)
>>> >> >
>>> >> > #the DenDF are off with gls, so use the 'correct' ones
>>> >> > 1-pf(.9935, 1, 23)
>>> >> > 1-pf(16.2323, 5, 91)
>>> >> > 1-pf(1.7041, 1, 23)
>>> >> > 1-pf(3.0367, 5, 91)
>>> >> > #and the output matches exactly
>>> >> >
>>> >> > However, I can not get the lme function to run the autoregression.  The
>>> >> > output is very different:
>>> >> >
>>> >> > fit.11=lme(var~group+Event_name+Died+group*Event_name,
>>> >> >    data=dat34,
>>> >> >    random=~1|ID,
>>> >> >    corr=corAR1(ID),
>>> >> >    weight=varIdent(~1|Event_name))
>>> >> > anova.lme(fit.11, type="marginal", adjustSigma=F)
>>> >> >
>>> >> >                       numDF denDF  F-value      p-value
>>> >> > (Intercept)               1    96        9.816419  0.0023
>>> >> > group                      1    23       0.131950  0.7197
>>> >> > Event_name            5    96        1.081785  0.3754
>>> >> > Died                       1    23        0.074428  0.7874
>>> >> >
>>> >> > Is this type of covariance structure only done with gls and I should
>>> >> > continue with the analysis as such or am I doing something silly with
>>> >> > lme?
>>> >> >
>>> >> > My second question is with regards to the unstructured covariance.
>>> >> > SAS output (UN)
>>> >> >
>>> >> > *Type 3 Tests of Fixed Effects*
>>> >> >
>>> >> > *Effect        NumDF DenDF F Value Pr > F*
>>> >> >
>>> >> > *Pig_group    *1            23         2.73    0.1120
>>> >> >
>>> >> > *Event_name *5            23         1.11    0.3806
>>> >> >
>>> >> > *Died              *1            23         0.51    0.4833
>>> >> >
>>> >> > R output (corSymm = UN?)
>>> >> > fit.11=lme(var2~group+Event_name+Died,
>>> >> >    data=dat34,
>>> >> >    random=~1|ID,
>>> >> >    corr=corSymm(, ~1|ID),
>>> >> >    weight=varIdent(~1|Event_name))
>>> >> > anova.lme(fit.11, type="marginal", adjustSigma=F)
>>> >> >
>>> >> > #same as corAR1???
>>> >> >                       numDF denDF  F-value      p-value
>>> >> > (Intercept)               1    96        9.816419  0.0023
>>> >> > group                      1    23       0.131950  0.7197
>>> >> > Event_name            5    96        1.081785  0.3754
>>> >> > Died                       1    23        0.074428  0.7874
>>> >> >
>>> >> > but with gls
>>> >> > fit.18=gls(var~group+Event_name+Died,
>>> >> >    data=dat34,
>>> >> >    corr=corSymm(~1|ID),
>>> >> >    weight=varIdent(~1|Event_name))
>>> >> > anova(fit.18, type="marginal", adjustSigma=F)
>>> >> >
>>> >> > 1-pf(2.869837, 1, 23)
>>> >> > 1-pf(1.126747, 5, 23)
>>> >> > 1-pf(.514726, 1, 23)
>>> >> >
>>> >> > [1] 0.1037549
>>> >> > [1] 0.3742309
>>> >> > [1] 0.4803239
>>> >> >
>>> >> > #close but not exact (however I can work with this if it is indeed
>>> >> > correct)
>>> >> >
>>> >> > Overall, I want to clarify the difference between gls and lme and if I
>>> >> > am
>>> >> > simply making some weird syntax error with lme that I can't seem to get
>>> >> > the
>>> >> > covariance structures to match.
>>> >> >
>>> >> >
>>> >> > Apologies for lots of information in one go, but hopefully this provides
>>> >> > necessary information to point me in the correct direction.
>>> >> >
>>> >> > Thanks to any and all who give their time to these questions,
>>> >> >
>>> >> > Regards,
>>> >> > Charles
>>> >> >
>>> >> >        [[alternative HTML version deleted]]
>>> >> >
>>> >> > _______________________________________________
>>> >> > R-sig-mixed-models at r-project.org mailing list
>>> >> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Joshua Wiley
>>> >> Ph.D. Student, Health Psychology
>>> >> Programmer Analyst II, Statistical Consulting Group
>>> >> University of California, Los Angeles
>>> >> https://joshuawiley.com/
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Joshua Wiley
>>> Ph.D. Student, Health Psychology
>>> Programmer Analyst II, Statistical Consulting Group
>>> University of California, Los Angeles
>>> https://joshuawiley.com/
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/



More information about the R-sig-mixed-models mailing list