[R-sig-ME] lme4 and calculating QAICc

Tue Feb 23 16:08:05 CET 2010

Dear R-helpers,

some -late- reactions on the dicussion between Marcus Rowcliffe & Ben Bolker:

Marcus Rowcliffe wrote:
> Many thanks Ben
> 
> The squaring of the scale parameter is a bit of a surprise! It seems 
> to give sensible results (see below *), but I can't immediately see 
> why it's necessary - is there an obvious explanation?

Ben Bolker answered:
>  Because c-hat is supposed to be on the same scale as the deviance -- the deviance is essentially on a variance, or 
> sum-of-squares scale, not a standard deviation/root-mean-square scale.  @sigma is defined as a standard deviation, not
> a variance.

>  For what it's worth, Doug Bates has said in the past that he's not
> *sure* that sigma really corresponds to the same quantity that we would want to use as sqrt(c-hat) -- it seems 
> reasonable, but no-one to my knowledge has either sat down and worked through it carefully *or* tested with
>simulations.
> Any comment/insight appreciated.

I actually sent an e-mail directly to Ben Bolker on these issue earlier this year, reaching the same conclusion that the scale parameter should be squared. Here was my rationale:

> I however went to question this choice: indeed, sigma is a standard 
> deviation, whereas if I understand Burnham & Anderson books correctly 
> (pp.68-69), c-hat should rather be a variance (seen both through its 
> effect on the variance estimators and on the way c-hat is calculated 
> from the chi square statistic). Then: shouldn't we use the square of 
> the sigma, and not sigma itsel: i.e.:
> 
> phi = (lme4:::sigma(mq1))^2
> 
> ? This should correspond to the variance parameter of the "Residual"
> line in the summary of the lmer function.

I however agree with Ben Bolker's remark that this should be studied more closely.

Marcus Rowcliffe wrote:
> 2.	Comparing poisson and quasipoisson models, I note that the
> former has no residual random variance, and hence one fewer degrees of 
> freedom than the quasipoisson. Why is this? It obviously has important 
> implications for the QAIC calculation.

Ben Bolker replied:
>  I find this a bit difficult myself.  Technically, it's another 'estimated parameter', but it's a nuisance parameter 
> that does *not* affect the predictions at all (it is simply calculated from the residuals of the model), so I'm not 
> really sure whether it should affect the estimate of expected Kullback-Leibler distance or not ... however, I would 
> quibble a little bit with "important implications for the QAIC calculation" -- it's just 1 degree of freedom!  To 
> quote _Numerical Recipes in C_ 2d ed. p. 611 (Press et al): "We might also comment that if the difference between N 
> and N-1 ever matters to you, then you are probably up to no good anyway -- e.g., trying to substantiate a questionable 
> hypothesis with marginal data.")

I also suggested this in my former message to Ben Bolker. Burnham & Anderson (2002, p.69) specify that we should include the dispersion parameter in the number of parameters used in the QAIC/AIC formula. Surely the quotation from Numerical Recipes in C has a part of truth. However, a Bayesian might have a different perspective on the link between the estimation of the nuisance parameter and the precision of the predictions. I would therefore rather add an extra parameter when changing from Poisson to quasi-Poisson...

I join at the end of the message a proposal of program for the calculus of QAIC for different kinds of mixed models, under different versions of R and probably also for S-Plus: any remark is welcome. This refers to another AIC.lme program that I can also share. I also have ones for QAICc and QAICu.

Two final remarks on quasi-likelihoods:
	(i) take care of the version of R on which you use it: http://markmail.org/message/s4abxhhdacqjkunm

	(ii) the case of under-dispersed data relative to the Poisson distribution is not clear:  Burnham & Anderson (2002, p.69) say that in such cases we should take c-hat=1 (or dispersion parameter), with no reason why. I personnally keep the calculated sigma in the calculus of the QAIC below. Here too some more work is needed.

All the best,

Frédéric Gosselin
Engineer & Researcher (PhD) in Forest Ecology
Cemagref
Domaine des Barres
F-45290 Nogent sur Vernisson
France

***********************************************************************************************************************************************************************************************************************************************

"QAIC.lme"<-
function(x, c = NULL,mod.df = 0,mod.N=0)
{

	#number of parameters should include the estimated dispersion (Bunrham & Anderson 2002, p.69)
	# (c being fixed for a class of models to be compared; cf. Venables & Ripley)
	#mod.df and mod.N are for "manual" modifications of the numbers of degrees of freedom or numbers of observations
	if(!is.element(class(x), c("glmer","lmer","mer", "nlme", "gls", "gnls"))) stop(
			"Object x not lmer, nlme, gls or gnls")
if (is.null(c))
{
if ((is.null(c))& (!is.null(attributes(x)$dev["sigmaML"])))
{
c<-attributes(x)$dev["sigmaML"]^2
}
else
{
if ((is.null(c))& (!is.null(attributes(summary(x))$devComp["scale"])))
{
c<-attributes(summary(x))$devComp["scale"]^2
}
else
{
if ((is.null(c))& (!is.null(attributes(summary(x))$sigma)))
{
c<-attributes(summary(x))$sigma^2
}

else
{

}
}
}
}

if (!is.element(class(x), c("glmer","lmer","mer"))){K <- attributes(logLik(x))$df + mod.df} else {K <- attributes(logLik(x))$df + mod.df}

if (is.R())
{#adding 1 in this case because in these old versions the estimation of the dispersion parameter/innermost variance is not included in the count; unclear how it works for R2.6...
if (as.double(R.Version()$minor)<7.0&c!=1&is.element(class(x), c("glmer","lmer","mer"))){K<-K+1}}

#n<-attributes(logLik(x))$nall
if(!is.element(class(x), c("glmer","lmer","mer"))){n <- x$dims$N+mod.N
if(x$method == "REML") {
print("Object x fitted with REML; take care that models have similar fixed effects and contrasts before comparing them with AIC"
)}} else {n <- attributes(logLik(x))$nall+mod.N}

#here, we will only divide the logLik by the "dispersion parameter" (a variance) if this is a quasi binomial or Poisson (and not a general gaussian like quasi)
correct.logLik<-F
if (is.R())
{
family.ref<-attributes(summary(x))$family
if (is.null(family.ref)){family.ref<-paste(as.character(attributes(summary(x))$call),sep=" ")}
if (!is.null(family.ref)){
if (max(c(max(sapply(family.ref,function(x){if(is.character(x)){regexpr("quasi",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Quasi",x)>0}else{F}}))))==T & max(c(max(sapply(family.ref,function(x){if(is.character(x)){regexpr("logit",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Logit",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("log",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Log",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("poisson",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Poisson",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("binomial",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Binomial",x)>0}else{F}}))))==T)

correct.logLik<-T}

#we also allow corrected AICc=> QAICc in case of models fitted with binomial or Poisson:
if (!is.null(family.ref)){
if (max(c(max(sapply(family.ref,function(x){if(is.character(x)){regexpr("poisson",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Poisson",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("binomial",x)>0}else{F}})),
max(sapply(family.ref,function(x){if(is.character(x)){regexpr("Binomial",x)>0}else{F}}))))==T)

correct.logLik<-T}

if (is.element(class(x), c("glmer","lmer","mer")) & correct.logLik & (as.double(R.Version()$minor)>=7.0&R.Version()$major=="2"))
{print("lmer does not work well with quasi family for this R version")
stop("cf. http://markmail.org/message/s4abxhhdacqjkunm")} 

}

	if (!is.null(c) & correct.logLik==T)
		{

#condition removed for working with lmer & co: attributes(x)[["modelStruct
	QAIC <- (-2 * logLik(x)[[1]] )/c + 2 * K

	}
	else
	{QAIC <- AIC.lme(x,mod.df,mod.N)

	}

	QAIC
}