[R-sig-ME] Random vs. fixed effects

Christos Argyropoulos argchris at hotmail.com
Mon Apr 26 05:37:13 CEST 2010


Dave Fournier wrote ...

"It appears that the real problem here is not with the data, but with the
estimation procedure...."

This appears to be the case, but ADMB is not the only package that produces the "right" REML solution.
The "old" function (lme from package nlme) yields the "correct" answer and so does proc mixed (in SAS)

The results are summarized in the following table (the code and the R/SAS program output is attached at
the end of the email). 


Function              4 Level Factor           50 Level Factor        Truth

lmer(lme4)                1.0565                            3.3469                     4.0
lme(nlme)                  4.4215                             3.3221                     4.0
proc mixed           4.4214                             3.3221                      4.0
ADMB                            4.4214               NA               4.0


If I remember correctly lme(nlme) uses a few EM steps before switching to a Newton type of algorithm 
for the optimization (what about lmer?). It is sort of funny that the SAS help file on proc mixed downplays this
approach, yet the results are the same!

SAS Help file text:
('PROC MIXED uses a 
ridge-stabilized Newton-Raphson algorithm to optimize either a full (ML) or 
residual (REML) 
likelihood function. The Newton-Raphson algorithm is preferred 
to the EM algorithm (Lindstrom and Bates 1988) ')

If I find the time I will run the same example in WinBUGS to see what comes out of it.

The discussion of how to adjust for such effects (fixed v.s. random effects) has an interesting counterpart in survival analysis i.e. whether
center effects should be adjusted for by including them a in the model design matrix or by specifying a separate baseline hazard
for each of the centers (=levels of a factor). It all boils down to what one wants to achieve by such adjustment; if the factor effects are of inferential interest them I guess they should be included as fixed effects. If they are not of primary inferential interest, then including them as random effects 
(or baseline hazards in survival analysis) is the way to go.

Christos Argyropoulos

CODE


R side (to simulate the datasets)

 set.seed(1)
 n <- 10000
 k <- 4
 f <- function(n, k) {
 set.seed(1)
 x <- 1:n
 fac <- gl(k, 1, n)
 fac.eff <- rnorm(k, 0, 4)[fac]
 e <- rnorm(n)
 y <- 1 + 2 * x + fac.eff + e
data.frame(y=y,x=x,fac_eff=fac.eff,fac=fac)
}

d4<-f(10000,4)
d50<-f(10000,50)

write.dta(d4,"d4.dta")
write.dta(d50,"d50.dta")

(save the stata files somewhere handy e.g. in C:\Temp because you will need it to import the data
in SAS)
####################################################################
FACTOR WITH 4 LEVELS in R
####################################################################

> lme(y~x,random=~1|fac,data=d4)
Linear mixed-effects model fit by REML
  Data: d4
  Log-restricted-likelihood: -14342.06
  Fixed: y ~ x
(Intercept)           x
   1.313495    1.999999

Random effects:
 Formula: ~1 | fac
        (Intercept) Residual
StdDev:    4.421380 1.012295

Number of Observations: 10000
Number of Groups: 4

> lmer(y~x+(1|fac),data=d4)
Linear mixed model fit by REML
Formula: y ~ x + (1 | fac)
   Data: d4
   AIC   BIC logLik deviance REMLdev
 28733 28762 -14363    28702   28725
Random effects:
 Groups   Name        Variance Std.Dev.
 fac      (Intercept) 1.1162   1.0565
 Residual             1.0298   1.0148
Number of obs: 10000, groups: fac, 4

Fixed effects:
             Estimate Std. Error t value
(Intercept) 1.313e+00  5.286e-01       2
x           2.000e+00  3.515e-06  568923

Correlation of Fixed Effects:
  (Intr)
x -0.033
Warning message:
In mer_finalize(ans) : false convergence (8)


####################################################################
FACTOR WITH 50 LEVELS in R
####################################################################

> lme(y~x,random=~1|fac,data=d50)
Linear mixed-effects model fit by REML
  Data: d50
  Log-restricted-likelihood: -14515.87
  Fixed: y ~ x
(Intercept)           x
   1.396288    2.000000

Random effects:
 Formula: ~1 | fac
        (Intercept) Residual
StdDev:    3.322084  1.01249

Number of Observations: 10000
Number of Groups: 50
>

> lmer(y~x+(1|fac),data=d50)
Linear mixed model fit by REML
Formula: y ~ x + (1 | fac)
   Data: d50
   AIC   BIC logLik deviance REMLdev
 29040 29069 -14516    29009   29032
Random effects:
 Groups   Name        Variance Std.Dev.
 fac      (Intercept) 11.2016  3.3469
 Residual              1.0251  1.0125
Number of obs: 10000, groups: fac, 50

Fixed effects:
             Estimate Std. Error t value
(Intercept) 1.396e+00  4.738e-01       3
x           2.000e+00  3.507e-06  570242

Correlation of Fixed Effects:
  (Intr)
x -0.037
>


******************************************
SAS side
******************************************
PROC IMPORT OUT= WORK.d4
            DATAFILE= "C:\Temp\d4.dta"
            DBMS=DTA REPLACE;
RUN;

PROC IMPORT OUT= WORK.d50
            DATAFILE= "C:\Temp\d50.dta"
            DBMS=DTA REPLACE;
RUN;
proc sort data=work.d4 out=work.d4;
        by fac;
run;
proc sort data=work.d50 out=work.d50;
        by fac;
run;


proc mixed data=work.d4 method=REML;
   class fac;
   model y = x/s;
   random fac;
run;

proc mixed data=work.d50;
   class fac;
   model y = x/s;
   random fac;
run;


####################################################################
FACTOR WITH 4 LEVELS in SAS
####################################################################


                                    Convergence criteria met.

                                          The SAS System          22:12 Sunday, April 25, 2010  25

                                       The Mixed Procedure

                                      Covariance Parameter
                                            Estimates

                                      Cov Parm     Estimate

                                      FAC           11.0363
                                      Residual       1.0251


                                         Fit Statistics

                              -2 Res Log Likelihood         29031.7
                              AIC (smaller is better)       29035.7
                              AICC (smaller is better)      29035.7
                              BIC (smaller is better)       29039.5


                                    Solution for Fixed Effects

                                          Standard
                 Effect       Estimate       Error      DF    t Value    Pr> |t|

                 Intercept      1.3963      0.4703      49       2.97      0.0046
                 X              2.0000    3.507E-6    9949     570221      <.0001


                                  Type 3 Tests of Fixed Effects

                                        Num     Den
                          Effect         DF      DF    F Value    Pr> F

                          X               1    9949    3.25E11    <.0001


####################################################################
FACTOR WITH 50 LEVELS in SAS
####################################################################


                                          The SAS System          22:12 Sunday, April 25, 2010  25

                                       The Mixed Procedure

                                      Covariance Parameter
                                            Estimates

                                      Cov Parm     Estimate

                                      FAC           11.0363
                                      Residual       1.0251


                                         Fit Statistics

                              -2 Res Log Likelihood         29031.7
                              AIC (smaller is better)       29035.7
                              AICC (smaller is better)      29035.7
                              BIC (smaller is better)       29039.5


                                    Solution for Fixed Effects

                                          Standard
                 Effect       Estimate       Error      DF    t Value    Pr> |t|

                 Intercept      1.3963      0.4703      49       2.97      0.0046
                 X              2.0000    3.507E-6    9949     570221      <.0001


                                  Type 3 Tests of Fixed Effects

                                        Num     Den
                          Effect         DF      DF    F Value    Pr> F

                          X               1    9949    3.25E11    <.0001

 		 	   		  
_________________________________________________________________
Hotmail: Trusted email with Microsoft’s powerful SPAM protection.




More information about the R-sig-mixed-models mailing list