[R-sig-ME] Random vs. fixed effects
Christos Argyropoulos
argchris at hotmail.com
Mon Apr 26 05:37:13 CEST 2010
Dave Fournier wrote ...
"It appears that the real problem here is not with the data, but with the
estimation procedure...."
This appears to be the case, but ADMB is not the only package that produces the "right" REML solution.
The "old" function (lme from package nlme) yields the "correct" answer and so does proc mixed (in SAS)
The results are summarized in the following table (the code and the R/SAS program output is attached at
the end of the email).
Function 4 Level Factor 50 Level Factor Truth
lmer(lme4) 1.0565 3.3469 4.0
lme(nlme) 4.4215 3.3221 4.0
proc mixed 4.4214 3.3221 4.0
ADMB 4.4214 NA 4.0
If I remember correctly lme(nlme) uses a few EM steps before switching to a Newton type of algorithm
for the optimization (what about lmer?). It is sort of funny that the SAS help file on proc mixed downplays this
approach, yet the results are the same!
SAS Help file text:
('PROC MIXED uses a
ridge-stabilized Newton-Raphson algorithm to optimize either a full (ML) or
residual (REML)
likelihood function. The Newton-Raphson algorithm is preferred
to the EM algorithm (Lindstrom and Bates 1988) ')
If I find the time I will run the same example in WinBUGS to see what comes out of it.
The discussion of how to adjust for such effects (fixed v.s. random effects) has an interesting counterpart in survival analysis i.e. whether
center effects should be adjusted for by including them a in the model design matrix or by specifying a separate baseline hazard
for each of the centers (=levels of a factor). It all boils down to what one wants to achieve by such adjustment; if the factor effects are of inferential interest them I guess they should be included as fixed effects. If they are not of primary inferential interest, then including them as random effects
(or baseline hazards in survival analysis) is the way to go.
Christos Argyropoulos
CODE
R side (to simulate the datasets)
set.seed(1)
n <- 10000
k <- 4
f <- function(n, k) {
set.seed(1)
x <- 1:n
fac <- gl(k, 1, n)
fac.eff <- rnorm(k, 0, 4)[fac]
e <- rnorm(n)
y <- 1 + 2 * x + fac.eff + e
data.frame(y=y,x=x,fac_eff=fac.eff,fac=fac)
}
d4<-f(10000,4)
d50<-f(10000,50)
write.dta(d4,"d4.dta")
write.dta(d50,"d50.dta")
(save the stata files somewhere handy e.g. in C:\Temp because you will need it to import the data
in SAS)
####################################################################
FACTOR WITH 4 LEVELS in R
####################################################################
> lme(y~x,random=~1|fac,data=d4)
Linear mixed-effects model fit by REML
Data: d4
Log-restricted-likelihood: -14342.06
Fixed: y ~ x
(Intercept) x
1.313495 1.999999
Random effects:
Formula: ~1 | fac
(Intercept) Residual
StdDev: 4.421380 1.012295
Number of Observations: 10000
Number of Groups: 4
> lmer(y~x+(1|fac),data=d4)
Linear mixed model fit by REML
Formula: y ~ x + (1 | fac)
Data: d4
AIC BIC logLik deviance REMLdev
28733 28762 -14363 28702 28725
Random effects:
Groups Name Variance Std.Dev.
fac (Intercept) 1.1162 1.0565
Residual 1.0298 1.0148
Number of obs: 10000, groups: fac, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.313e+00 5.286e-01 2
x 2.000e+00 3.515e-06 568923
Correlation of Fixed Effects:
(Intr)
x -0.033
Warning message:
In mer_finalize(ans) : false convergence (8)
####################################################################
FACTOR WITH 50 LEVELS in R
####################################################################
> lme(y~x,random=~1|fac,data=d50)
Linear mixed-effects model fit by REML
Data: d50
Log-restricted-likelihood: -14515.87
Fixed: y ~ x
(Intercept) x
1.396288 2.000000
Random effects:
Formula: ~1 | fac
(Intercept) Residual
StdDev: 3.322084 1.01249
Number of Observations: 10000
Number of Groups: 50
>
> lmer(y~x+(1|fac),data=d50)
Linear mixed model fit by REML
Formula: y ~ x + (1 | fac)
Data: d50
AIC BIC logLik deviance REMLdev
29040 29069 -14516 29009 29032
Random effects:
Groups Name Variance Std.Dev.
fac (Intercept) 11.2016 3.3469
Residual 1.0251 1.0125
Number of obs: 10000, groups: fac, 50
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.396e+00 4.738e-01 3
x 2.000e+00 3.507e-06 570242
Correlation of Fixed Effects:
(Intr)
x -0.037
>
******************************************
SAS side
******************************************
PROC IMPORT OUT= WORK.d4
DATAFILE= "C:\Temp\d4.dta"
DBMS=DTA REPLACE;
RUN;
PROC IMPORT OUT= WORK.d50
DATAFILE= "C:\Temp\d50.dta"
DBMS=DTA REPLACE;
RUN;
proc sort data=work.d4 out=work.d4;
by fac;
run;
proc sort data=work.d50 out=work.d50;
by fac;
run;
proc mixed data=work.d4 method=REML;
class fac;
model y = x/s;
random fac;
run;
proc mixed data=work.d50;
class fac;
model y = x/s;
random fac;
run;
####################################################################
FACTOR WITH 4 LEVELS in SAS
####################################################################
Convergence criteria met.
The SAS System 22:12 Sunday, April 25, 2010 25
The Mixed Procedure
Covariance Parameter
Estimates
Cov Parm Estimate
FAC 11.0363
Residual 1.0251
Fit Statistics
-2 Res Log Likelihood 29031.7
AIC (smaller is better) 29035.7
AICC (smaller is better) 29035.7
BIC (smaller is better) 29039.5
Solution for Fixed Effects
Standard
Effect Estimate Error DF t Value Pr> |t|
Intercept 1.3963 0.4703 49 2.97 0.0046
X 2.0000 3.507E-6 9949 570221 <.0001
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr> F
X 1 9949 3.25E11 <.0001
####################################################################
FACTOR WITH 50 LEVELS in SAS
####################################################################
The SAS System 22:12 Sunday, April 25, 2010 25
The Mixed Procedure
Covariance Parameter
Estimates
Cov Parm Estimate
FAC 11.0363
Residual 1.0251
Fit Statistics
-2 Res Log Likelihood 29031.7
AIC (smaller is better) 29035.7
AICC (smaller is better) 29035.7
BIC (smaller is better) 29039.5
Solution for Fixed Effects
Standard
Effect Estimate Error DF t Value Pr> |t|
Intercept 1.3963 0.4703 49 2.97 0.0046
X 2.0000 3.507E-6 9949 570221 <.0001
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr> F
X 1 9949 3.25E11 <.0001
_________________________________________________________________
Hotmail: Trusted email with Microsoft’s powerful SPAM protection.
More information about the R-sig-mixed-models
mailing list