[R-sig-ME] Modeling variance heterogeneity in lme4 + meaning of VarCorr values

Mon Oct 20 14:45:13 CEST 2014

Hello,

I am currently working on estimating precision of artery diameter by
an imaging technique. Every 5 mm along one artery, diameter is
measured. This was done 10 times by the same reader, and another 10
times by a second reader, both on the same image. The aim is to have
an idea of the standard deviation of the diameter.

At first glance, it seems to be much higher at both ends of the
artery, probably because diameter changes quickly here hence a very
small shift in the origin may have great influence on the diameter ---
imagine a king of >----< shape.

To test this hypothesis, I built the following models in lme4, where
« Diametre » is the diameter, « Lecteur » the reader, « Lecture » the
reading and « Position » the position along the artery (as a factor);
d is the data.frame with the data:

# Single variance model
md.simple <- lmer( Diametre ~ Lecteur + Position + ( 1|Lecture ),
                   data = d, REML = FALSE )

# Different variance for each position model
#  1) building indicator variables for each position
for ( i in 0:7 ) {
    d[ , paste0( 'I.', i ) ] <- ifelse( d$Position == i, 1, 0 )
}

# 2) fit the model
md.tout   <- lmer( as.formula( paste( "Diametre ~ Lecteur + Position +",
                                      paste0( "(0+I.", 0:7,"|Lecture)",
                                      collapse = "+" ) ) ),
                   data = d, REML = FALSE )

My questions are

1) does it seem a pertinent model to really have a different variance
at each position? 

2) I guess it may be partly undetermined, since variance at position
is is the sum of the residual variance sigma² and the grouping term (
I.i|Lecture ) variance si², hence if variance at position i is for
instance 1 and is the smallest of all variances, we can have si² = 1,
sigma²=0, the otherr way round and any other combination such that the
sum is 1. Is it right?

   lmer seems to fit the model with several si² = 0, which seems to
mean that sigma² gets the "baseline" variance and terms are used only
if local variance is « much » higher, is it true?

   It gives a warning « In checkZrank(reTrms$Zt, n = n, control,
nonSmall = 1e+06) : number of observations <= rank(Z);
variance-covariance matrix will be unidentifiable », I guess this is
the result of unidentifiability. How far does it affects the results,
especially fixed effects estimates [« Reader » effect] and further
model comparisons?

2) When I try to reconstruct local variances using sigma( md.tout )
and VarCorr( md.tout ), to compare them to variances obtained by
tapply( d$Diametre, d$Position, var ), I've noticed that actual
standard deviations at position i are in attributes stddev of VarCorr.

This is indeed said in the documentation; however, I did not found
what are the actual values stocked directly in VarCorr list. I guess
it corresponds to variances in the rescaled model? If it is so, how is
determined the rescaling parameter? If not, can anyone explain or give
some reference to a technical doc what are these values, and the link
with actual variances?

Thanks in advance,

-- 
                                Emmanuel CURIS
                                emmanuel.curis at parisdescartes.fr

Page WWW: http://emmanuel.curis.online.fr/index.html