[R-sig-ME] Fw: mcmcsamp(lme4): What is contained in $ST and $sigma?

Tue Oct 7 23:35:01 CEST 2008

On Fri, Oct 3, 2008 at 1:55 PM, David LeBlond <David.LeBlond at abbott.com> wrote:
> Dear List,
>
> I am new to lme4. I believe it will be very useful for analysis of
> pharmaceutical stability data in which batches are considered to have
> random slopes and intercepts. We now use SAS Proc Mixed for this, but I
> like the lme4 approach and the other capabilities of R better.
>
> What attracts me to lme4 is the possibility of obtaining a posterior
> sample using mcmcsamp(). How else can one obtain meaningful interval
> estimates or make predictions about future data?
>
> However, I do not know how to interpret the contents of the mcmcsamp
> output slots $ST and $sigma. I find that the samples contained in those
> slots do not agree with the estimates provided by lmer(). I hope someone
> more knowledgeable will be willing to explain this to me.
>
> To make things concrete, I provide below an example analysis of simulated
> data.
>
> The following code generates 50 batches worth of stability tests. Batch
> intercepts and slopes are drawn from N(100,,4^2) and N(-1,2^2). Analytical
> results at each time point (0,3,...,36 months) are drawn from
> N(intercept+slope*month, 1^2). So there are 3 random paramters (variance
> for slope, intercept, and error) and 2 fixed effects (mean slope and
> intercept). With such a large balanced data set, I expect rapid
> convergence of the Gibbs sampler.
>
> B<-50 # number of batches
> Months<-c(0,3,6,9,12,24,30,36) # pull points
> n.Months<-length(Months)
> Init.mean<-100 # initial level for the process
> Init.SD<-4 # SD of the process initial
> Slope.mean<- -1
> Slope.SD<- 2
> Error.SD<- 1
> # Prepare Data Vectors
> Batch<-rep(NA,B*n.Months)
> Month<-rep(NA,B*n.Months)
> Y<-rep(NA,B*n.Months)
> for(i in 1:B){
>  slope<-rnorm(1,Slope.mean,Slope.SD)
>  intercept<-rnorm(1,Init.mean,Init.SD)
>  for(j in 1:n.Months){
>    k<-(i-1)*n.Months+j
>    Batch[k]<-i
>    Month[k]<-Months[j]
>    Y[k]<-intercept+Month[k]*slope+rnorm(1,0,Error.SD)
>  }
> }
> Stab<-cbind(Batch=factor(Batch),Month=Month,Y=Y);Stab
>
> I use the following to analyze these data:
>
> library(lme4)
> anal<-lmer(Y~1+Month+(1|Batch)+(-1+Month|Batch))
> anal
>
> The following output is obtained:
>
> Linear mixed model fit by REML
> Formula: Y ~ 1 + Month + (1 | Batch) + (-1 + Month | Batch)
>  AIC  BIC logLik deviance REMLdev
>  1821 1841 -905.4     1811    1811
> Random effects:
>  Groups   Name        Variance Std.Dev.
>  Batch    (Intercept) 19.5551  4.42212
>  Batch    Month        4.4818  2.11704
>  Residual              0.9780  0.98894
> Number of obs: 400, groups: Batch, 50
>
> Fixed effects:
>            Estimate Std. Error t value
> (Intercept) 100.0254     0.6302  158.73
> Month        -1.3988     0.2994   -4.67
>
> Correlation of Fixed Effects:
>      (Intr)
> Month -0.001
>
> This is terrific. The estimates agree well with theory. Armed with
> confidence (or rather credibility?) I obtain interval estimate as follows:
>
> HPDinterval(mcmcsamp(anal,n=10000))
>
> The following output is obtained:
>
> $fixef
>                lower       upper
> (Intercept) 99.447112 100.5605457
> Month       -1.905921  -0.8496636
> attr(,"Probability")
> [1] 0.95
>
> $ST
>         lower     upper
> [1,] 0.7546098 0.9719147
> [2,] 0.6913345 1.7690498
> attr(,"Probability")
> [1] 0.95
>
> $sigma
>        lower    upper
> [1,] 1.710004 2.169471
> attr(,"Probability")
> [1] 0.95
>
> While the fixed effect posteriors compare very well with truth and with
> lmer point estimates, the $sigma is strange. I expect it to be close to 1
> since this is the error SD used to generate the data and also the value
> reported by lmer. Instead $sigma contains a values between 1.7 and 2.2.
> Why?

Most likely because there is a mistake in the mcmcsamp function at
present.  I have seen the same thing when trying to get an MCMC sample
from a model with two uncorrelated random effects per group.  I'm not
sure if the problem is in the theory or in the implementation but
there is definitely a problem.  (I hope it is in the implementation
and not the theory.)

> Also, I expect the $ST slot to bear some resemblance to the true SDs for
> intercept and slope (the other 2 random parameters in the model).  I
> expect values near 4 and 2 respectively, not near 0.8 and 1.1
> respectively. I realize there may be some "common scale" factor involved
> here, but I cannot rationalize how to transform $ST into something I can
> use. This is particularly vexing since the "common scale factor" ($sigma)
> bears little resemblence to expectation. Further, the common scale factor
> is actually a paramter and therefore has a distribution; a simple
> multiplication by a point estimate of sigma would not seem appropriate to
> obtain posteriors for SD.slope and SD.intercept.

Yes.  What is supposed to happen is that the product of an element of
the ST array (these are the parameters that determine the relative
standard deviations of the random effects) and the corresponding
element of sigma would give you the value of the standard deviation of
the random effects.

> The kind readers who have followed me through this will realize I am quite
> clueless. Possibly I have not found the right background material (though
> I have searched hard. Prof Bates kindly provided me some useful
> presentations). Or possibly, mcmcsamp() is simply not ready for prime
> time. I can live with that - for now. Maybe I am using the functions
> inappropriately.

The current implementation of mcmcsamp is not ready for prime time.

> I can obtain posterior samples using the PRIOR statement of SAS Proc Mixed
> for a wide class of variance component models. The difficulty with SAS is
> that technical descriptions of the methodology are not forthcoming, SAS
> has not been very progressive, the modeling syntax is confusing and
> non-intuitive, one is quickly limited by the very painful SAS graphics,
> and - of course - SAS is not free.
>
> Any suggestions, comments, or insight would be greatly appreciated. This
> lmeX library could be extremely useful to me in my nonclinical statistical
> work and I really appreciate that it is under development. I hope to
> understand it better and thank you in advance for your help!!
>
> Best regards,
>
> Dave LeBlond
> Sr. Statistician, Non-Clinical Statistics
> email: david.leblond at abbott.com
> voice mail: 847-935-6031
> Dept R436, Bldg AP9A-1
> Abbott Laboratories
> 100 Abbott Park Road
> Abbott Park, IL 60064-3500
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>