[R-sig-ME] Help understanding residual variance
Joshua Wiley
jwiley.psych at gmail.com
Thu Mar 29 17:29:52 CEST 2012
Hi Ista,
To me the onis is on the statistician consultant to explain *why* you
cannot have both random intercepts and slopes. Does the consultant
have papers to reference or proofs?
In any case, this is hardly exclusive to 'R doing something strange'.
SAS and Stata happily join the gang. See the attached file for code
and output from all three using a minidataset simulated in R.
I suppose one could bicker over whether a random intercept and slope
is a good idea, but possible it certainly is. You might suggest that
it is poor fare to voice strong opinions about matters which one does
not understand.
Cheers,
Josh
On Thu, Mar 29, 2012 at 3:29 AM, Ista Zahn <istazahn at gmail.com> wrote:
> Hi Reinhold,
>
> Good question. My consultant didn't seem impressed when I tried to
> articulate that explanation, but perhaps I wasn't clear.
>
> Thanks,
> Ista
> On Thu, Mar 29, 2012 at 1:45 AM, Reinhold Kliegl
> <reinhold.kliegl at gmail.com> wrote:
>> But why is Greg Snow's response inadequate?
>>
>> Restating his argument: In an LMM we are not estimating individual
>> random effects (means, slopes) and individual residuals, but variance
>> of random effects and variance of residuals. So there can be
>> differences between a subject's observed random effect and random
>> slope and conditional modes of the distribution of the random effects
>> (i.e., the point of maximum density), given the observed data and
>> evaluated at the parameter estimates.
>>
>> I think your statistician's answer is a good argument that you must
>> not treat conditional modes as independent observations in a
>> subsequent analyses. For example, we showed with simulations that
>> correlations between conditional modes of slopes and intercepts are
>> larger than the correlation parameter estimated in the LMM (Kliegl,
>> Masson, & Richer, Visual Cognition, 2010).
>>
>> Reinhold Kliegl
>>
>> --
>> Reinhold Kliegl
>> http://read.psych.uni-potsdam.de/pmr2
>>
>> On Tue, Mar 27, 2012 at 4:18 AM, Ista Zahn <istazahn at gmail.com> wrote:
>>> Hi all,
>>>
>>> I'm trying to understand what the residual variance in this model:
>>>
>>> tmp <- subset(sleepstudy, Days == 1 | Days == 9)
>>> m1 <- lmer(Reaction ~ 1 + Days + (1 + Days | Subject), data = tmp)
>>> tmp$fitted1 <- fitted(m1)
>>>
>>> represents. The way I read this specification, an intercept and a
>>> slope is estimated for each subject. Since each subject only has two
>>> measurements, I would expect the Reaction scores to be completely
>>> accounted for by the slopes and intercepts. Yet they are not: the
>>> Residual variance estimate is 440.278.
>>>
>>> This is probably a stupid question, but I hope you will be kind enough
>>> to humor me.
>>>
>>> Best,
>>> Ista
>>>
>>> _______________________________________________
>>> R-sig-mixed-models at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
-------------- next part --------------
############################################################
# R #
############################################################
set.seed(1)
d <- data.frame(y = c(y <- rnorm(100), y + rnorm(100)),
x = rep(0:1, each = 100), id = factor(rep(1:100, 2)))
d <- d[order(d$id), ]
require(lme4)
summary(lmer(y ~ 1 + (1 + x | id), data = d))
## > summary(lmer(y ~ 1 + (1 + x | id), data = d))
## Linear mixed model fit by REML
## Formula: y ~ 1 + (1 + x | id)
## Data: d
## AIC BIC logLik deviance REMLdev
## 548.6 565.1 -269.3 535.6 538.6
## Random effects:
## Groups Name Variance Std.Dev. Corr
## id (Intercept) 0.60384 0.77707
## x 0.50393 0.70988 0.366
## Residual 0.20293 0.45047
## Number of obs: 200, groups: id, 100
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.10885 0.08982 1.212
require(nlme)
summary(lme(y ~ 1, random = ~ 1 + x | id, data = d))
## > summary(lme(y ~ 1, random = ~ 1 + x | id, data = d))
## Linear mixed-effects model fit by REML
## Data: d
## AIC BIC logLik
## 548.6301 565.0967 -269.3151
## Random effects:
## Formula: ~1 + x | id
## Structure: General positive-definite, Log-Cholesky parametrization
## StdDev Corr
## (Intercept) 0.8231605 (Intr)
## x 0.8071236 0.193
## Residual 0.3594008
## Fixed effects: y ~ 1
## Value Std.Error DF t-value p-value
## (Intercept) 0.1088521 0.08981989 100 1.211893 0.2284
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -1.18533542 -0.26837701 -0.03513932 0.29186404 1.25726715
## Number of Observations: 200
## Number of Groups: 100
require(foreign)
write.dta(d, file = "d:/d.dta")
write.csv(d, file = "d:/d.csv", row.names = FALSE)
############################################################
# SAS #
############################################################
# options nocenter nolabel nodate formchar="|----|+|---+=|-/\<>";
# PROC IMPORT OUT= WORK.d
# DATAFILE= "D:\d.csv"
# DBMS=CSV REPLACE;
# GETNAMES=YES;
# DATAROW=2;
# RUN;
#
# proc mixed data=d method=reml noclprint;
# class id;
# model y = / solution;
# random int x / subject=id type=un;
# run;
## The Mixed Procedure
## Model Information
## Data Set WORK.D
## Dependent Variable y
## Covariance Structure Unstructured
## Subject Effect id
## Estimation Method REML
## Residual Variance Method Profile
## Fixed Effects SE Method Model-Based
## Degrees of Freedom Method Containment
## Dimensions
## Covariance Parameters 4
## Columns in X 1
## Columns in Z Per Subject 2
## Subjects 100
## Max Obs Per Subject 2
## Number of Observations
## Number of Observations Read 200
## Number of Observations Used 200
## Number of Observations Not Used 0
## Iteration History
## Iteration Evaluations -2 Res Log Like Criterion
## 0 1 615.81799232
## 1 4 545.16825286 0.05801788
## 2 1 539.21410924 0.00597671
## 3 1 538.64539097 0.00017233
## 4 1 538.63016700 0.00000020
## 5 1 538.63014985 0.00000000
## Convergence criteria met.
## Covariance Parameter Estimates
## Cov Parm Subject Estimate
## UN(1,1) id 0.3519
## UN(2,1) id 0.4540
## UN(2,2) id 0
## Residual 0.4549
## Fit Statistics
## -2 Res Log Likelihood 538.6
## AIC (smaller is better) 544.6
## AICC (smaller is better) 544.8
## BIC (smaller is better) 552.4
## Null Model Likelihood Ratio Test
## DF Chi-Square Pr > ChiSq
## 2 77.19 <.0001
## Solution for Fixed Effects
## Standard
## Effect Estimate Error DF t Value Pr > |t|
## Intercept 0.1089 0.08982 99 1.21 0.2284
############################################################
# Stata #
############################################################
# use "d:/d.dta", clear
# xtmixed y || id: x, cov(un) reml
# di -2*e(ll)
## . use "d:/d.dta", clear
## (Written by R. )
## . xtmixed y || id: x, cov(un) reml
## Performing EM optimization:
## Performing gradient-based optimization:
## Iteration 0: log restricted-likelihood = -269.31507
## Iteration 1: log restricted-likelihood = -269.31507 (backed up)
## Computing standard errors:
## Mixed-effects REML regression Number of obs = 200
## Group variable: id Number of groups = 100
## Obs per group: min = 2
## avg = 2.0
## max = 2
## Wald chi2(0) = .
## Log restricted-likelihood = -269.31507 Prob > chi2 = .
## ------------------------------------------------------------------------------
## y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
## -------------+----------------------------------------------------------------
## _cons | .1088521 .0898198 1.21 0.226 -.0671914 .2848956
## ------------------------------------------------------------------------------
## ------------------------------------------------------------------------------
## Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval]
## -----------------------------+------------------------------------------------
## id: Unstructured |
## sd(x) | .8071714 31.05985 1.42e-33 4.58e+32
## sd(_cons) | .8231815 15.22811 1.48e-16 4.59e+15
## corr(x,_cons) | .1930683 48.73245 -1 1
## -----------------------------+------------------------------------------------
## sd(Residual) | .3593491 34.8834 8.44e-84 1.53e+82
## ------------------------------------------------------------------------------
## LR test vs. linear regression: chi2(3) = 77.19 Prob > chi2 = 0.0000
## Note: LR test is conservative and provided only for reference.
## . di -2*e(ll)
## 538.63015
More information about the R-sig-mixed-models
mailing list