[R-sig-ME] Dealing with heteroscedasticity in a GLM/M (Paul Johnson)

Tue Aug 28 09:55:30 CEST 2012

Hi there,

I'm a psycholinguist, and have only recently started using logit mixed-effects to analyze categorical/binary data.

I was wondering if the issues of heteroscedasticity that have been discussed here also apply to repeated measures designs - where same participants take part in all conditions.

In experimental psycholinguistics, 2 x 2 repeated measures designs are usually favoured, and the interaction between the two factors tends to be one of the main interests of analysis. I've noticed that when the proportion of one of the conditions approaches 0 or 1, we tend to get surprising results for the interactions (something what you do not expect from patterns of means - this may be because we're looking at log-odds).  Also, if the overall mean is close to 0 or 1, coefficient estimates often do not make sense. I guess this is because the variance becomes not equaly when the proportion approaches 0 or 1.

Traditionally, ANOVAs have been the standard analytical tool for proportion data in our area (we normally apply arcsin-transformation on the proportions before submitting them to ANOVAs). But we've been recently encouraged to use mixed-effects mainly because analysing proportions violates the homogeneity of variance assumed by ANOVAs. So I'd like to know if the heterogeneity of variance also affects logit mixed-effects badly (to me, the effect seems more severe with logit mixed effects than with ANOVAs).

Any advice would be much appreciated.

Thank you for your attention.

Kumiko

Dr Kumiko Fukumura

School of Psychological Sciences and Health

University of Strathclyde

40 George Street

Glasgow, G1 1 QE

------------------------------

Message: 3
Date: Mon, 27 Aug 2012 11:53:15 -0500
From: Paul Johnson <pauljohn32 at gmail.com>
To: Leila Brook <leila.brook at my.jcu.edu.au>
Cc: "r-sig-mixed-models at r-project.org"
        <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] Dealing with heteroscedasticity in a GLM/M
Message-ID:
        <CAErODj9NbjtwrEPkczMYT-yLmTfmmu2meDyyDi0jDzRTN9T+mg at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Thu, Aug 23, 2012 at 12:58 AM, Leila Brook <leila.brook at my.jcu.edu.au> wrote:
> I am hoping to find a way to account for heterogeneity of variance between categories of explanatory variables in a generalised model.
>
> I have searched books and this forum, and haven't found any advice that I understood could help account for this assumption in a generalised model context, as I can't fit a variance structure in lme4.
>
>
>
> As background to my study:
>
> I used camera stations set up in pairs (one positioned on a track and one off the track) to record my study species, and used the same pairs in each of two seasons. As my surveys were repeated, I have specified camera pair as a random effect. I am using a binomial model in lme4 to model the proportion of nights an animal was recorded, as a function of the fixed effects of season (2 categories), position (categorical: whether on or off the track), area (categorical: one of two areas) and continuous habitat variables, plus interactions between them.
>
>
>
> I validated the GLM form of the model, including plotting the deviance residuals against my explanatory variables, and have noted that the variance of residuals for the categorical variables appears to differ.
>

Stop there.

In Logit/Probit frameworks, the variance is assumed equal for all
groups. It is never estimated.  The model is not identified otherwise.
The effect of heteroskedasticity is not just inefficiency, but
parameter bias. This makes logit models much more suspicious than
previously believed. This means that all of the work you have done so
far to "validate" your model is dubious and you need to take a step
back.

We are in a bind with logit models. Either we estimate separate models
for the separate groups (to avoid heteroskedasticity), but we are not
able to compare coefficients across models because there is that
different, but un-estimated variance.  Or we fit one model that
combines the group, make the wrong assumption, and end up with wrong
parameter estimates.  I don't mean just a little off. I mean wrong.
Its discouraging.

As far as I know, this problem was first popularized by Paul Allison,
Scott Long, and Richard Williams, but it is nicely surveyed in this
review essay:

Mood, Carina. 2010. Logistic Regression: Why We Cannot Do What We
Think We Can DO, and What We Can Do About It. European Sociological
Review 26(1): 67-82.

That has cites to the earlier Allison paper and some of Williams's work.

In my opinion, there are no completely safe approaches to dealing with
the heteroskedastic group-level error.  Richard Williams at Notre Dame
gave an excellent presentation about it.  He told me he has a paper
forthcoming in the Stata journal about it, but I don't feel free to
pass it along to you. But I bet his website has more information.

It seems to me that if you try to "pin" one group as the "baseline
variance" group and then add properly structured random effects for
the other ones, you might get a handle on it. The R package dglm has
suggestions like that.

Good luck. If you get an answer, I'd really like to know what is the
state of the art now (this minute)...

--
Paul E. Johnson
Professor, Political Science    Assoc. Director
1541 Lilac Lane, Room 504     Center for Research Methods
University of Kansas               University of Kansas
http://pj.freefaculty.org            http://quant.ku.edu

------------------------------

Message: 4
Date: Mon, 27 Aug 2012 18:23:41 +0100
From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
To: Jackson King <jackson.king722 at gmail.com>
Cc: R-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] MCMCglmm multinomial logit
Message-ID: <20120827182341.78871bjh8feszoyo at www.staffmail.ed.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
        format="flowed"

Hi Jackson,

If x1 is the fixed effect prediction for the log contrast B-A then

plogis(x1/sqrt(1+v1*c2))

gives the expected probability of being B rather than A after
integrating over the random effects (in your case the residuals under
the assumption that they have unit variance v1=1). You can write it
another way:

Have nu1 = x1+e1

plogis(x1/sqrt(1+v1*c2)) \approx  int plogis(nu1)pnorm(e1,0,sqrt(v1))de

and we can see the approximation is pretty good:

x1<-2
c2<-0.345843
v1<-1

plogis(x1/sqrt(1+v1*c2))
integrate(function(e1){plogis(x1+e1)*dnorm(e1,0,sqrt(v1))}, -Inf,Inf)

Obtaining the expected probability of being B rather than A, or C
rather than A, is therefore relatively straightforward.

Obtaining the expected probability of B rather than C is a bit more tricky.

take nu2 = log contrast C-A

nu1-nu2 = log contrast B-C

which is

x1+e1-x2+e2

and is distributed as N(x1-x2, V[1,1]+V[2,2]-2*V[1,2]) where in you case V=R.

We can therefore use the same approximation as before replacing x1
with x1-x2 and v1 with V[1,1]+V[2,2]-2*V[1,2].

If we chose V = IJ  then V[1,1] = V[2,2] = V[1,1]+V[2,2]-2*V[1,2] and
so we multiply c2 by the same constant for all contrasts. This was the
main motivation. In fact, it might make more sense to scale IJ by
1/(j-1) rather than 1/j so all contrasts have unit variance, but
anything could be used really as long as you are careful.

Looking at it now, the justification that these results/approximations
extend to the case A versus (B and C) as I suggest in the CourseNotes
seems more tenuous and I will check it out.

Cheers,

Jarrod

Quoting Jackson King <jackson.king722 at gmail.com> on Sun, 26 Aug 2012
09:28:45 -0500:

> Hello,
>
> I am attempting to fit a multinomial logit model (3 categories) with random
> effects (across states). I have attempted to follow the advice in the
> course notes, but am a little uncertain the reason for the priors on the
> residuals, and how this is used to calculate predicted probabilities. I
> would like my covariates to predict each outcome -- cov1 predicting option3
> versus option1 and cov1 predicting option2 versus option1, rather than a
> main effect.
>
> Here is a simple version of my model
>
> j<-length(levels(data$char))
>
> I<-diag<-(j-1) #2x2 identity matrix
>
> J=matrix(rep(1, (j-1)^2), c(j-1, j-1)) #2x2 Unit Matrix
>
> IJ <- (1/3) * (diag(2) + matrix(1, 2, 2)) #Residual covriance matrix
>
> prior = list(R = list(V =IJ, fix = 1), G = list(G1 = list(V = diag(2), nu =
> 0.002))) #Why can't I use V=diag(2) for prior R?
>
>
> model<- MCMCglmm(char~-1+trait*(cov1), random=~idh(trait):state, rcov = ~us(
> trait):units, data=data, family = "categorical", prior = prior, verbose =
> TRUE)
>
>
> I find that the model converges, and the coefficients look similar to
> maximum likelihood, but then I would like to predict probabilities for
> being in each category. Typically, I think this is done by
> plogis(x/sqrt(1+c2)), so why is it necessary to multiply by the delta
> matrix (course notes p. 97)? Alternatively, if I simply use a 2x2 diagonal
> matrix for the prior for R, shouldn't I be able to use the same
> transformation -- plogis(x/sqrt(1+c2)). In short, I am a little confused
> about the IJ matrix and where it comes from. Is there a quick answer, or
> another paper that explains this? And (2) is it reasonable to predict
> probabilities from my model, based on fixed values of cov1, using this
> simple transformation above... plogis(x/sqrt(1+c2))
>
> Many thanks.
>
> Jackson
>
>       [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

Message: 5
Date: Mon, 27 Aug 2012 18:26:25 +0100
From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
To: Emmanuel Curis <curis at pharmacie.univ-paris5.fr>
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Questions about MCMCglmm and marker data
Message-ID: <20120827182625.8866720cbjenthoo at www.staffmail.ed.ac.uk>
Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes";
        format="flowed"

Hi,

It is the effects that are to be considered normal, not the way the
genotypes are coded.

It is not possible to fit this model in the current version of
MCMCglmm, but it will be available in the next. I hope to submit this
version to CRAN in the next couple of weeks.

Cheers,

Jarrod

Quoting Emmanuel Curis <curis at pharmacie.univ-paris5.fr> on Thu, 23 Aug
2012 12:10:46 +0200:

> Hello,
>
> I'm not familiar with this kind of genomic problems, but if the
> columns are the number of a given SNP for a given individual and can
> be only 0, 1 and 2, is this really possible to consider they follow a
> normal distribution?
>
> On Thu, Aug 23, 2012 at 11:54:49AM +0200, Marie Denis wrote:
> ? Hi,
> ?
> ? I use the MCMCglmm function in the genomic selection context in the
> ? univariate case. In fact I have for each trait one marker matrix
> ? constituted of 0,1 and 2. The rows are the individuals and the columns
> ? the SNPs. In a first time, we consider that each SNP follow a normal
> ? distribution with the *same *variance. So I use the following model:
> ?
> ? prior.1.1 <- list(G=list(G1=list(V=diag(x = as.numeric(scale), nrow=1,
> ? ncol=1),nu=ddl),
> ?                            R=list(V=matrix(scale),nu=ddl))
> ?
> ? mcmc.fit.1.1 <- MCMCglmm(P~ 1,random=~idv(SNP),prior=prior.1.1,
> ?                    data=data1.1,
> ?                    nitt=5000, burnin=1000,verbose=FALSE,
> ?                    thin=10,pr=TRUE)
> ?
> ? So, I obtained a common variance associated to my SNPs.
> ?
> ?
> ? The second step is a bivariate analysis. I would like to obtain a
> ? (co)variance matrix 2*2  associated to the trait1 and trait 2 and the
> ? correlation between both. (one variance for the SNPs for the trait 1 and
> ? one for the trait 2). But I don't know how i can do. The following model
> ? give me only one variance for all SNPs and both traits.
> ?
> ?      prior.3<- list(G=list(G1=list(V=diag(x = as.numeric(scale), nrow=1,
> ?                            ncol=1),n=1)),
> ?                R=list(V=diag(x=as.numeric(0.1),nrow=2,ncol=2),n=2))
> ?
> ?      mcmc.fit.3 <- MCMCglmm(cbind(P1,P2)~ trait-1,random=~idv(trait:SNP),
> ?                    rcov=~idh(trait):units,
> ?                    prior=prior.3,
> ?                    data=data1.3,family=c("gaussian","gaussian"),
> ?                    nitt=2000, burnin=500,verbose=FALSE,
> ?                    thin=10,pr=TRUE)
> ?
> ?
> ?
> ? thanks for your help,
> ?
> ?
> ?
> ?
> ?     [[alternative HTML version deleted]]
> ?
> ? _______________________________________________
> ? R-sig-mixed-models at r-project.org mailing list
> ? https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> --
>                                 Emmanuel CURIS
>                                 emmanuel.curis at univ-paris5.fr
>
> Page WWW: http://emmanuel.curis.online.fr/index.html
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

------------------------------

Message: 6
Date: Mon, 27 Aug 2012 00:10:30 -0500
From: Mikhail Matz <matz at utexas.edu>
To: "r-sig-mixed-models at r-project.org"
        <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] bayes factors for MCMCglmm
Message-ID: <C657AD3B-2DEE-4373-93D4-E465212DCEF4 at utexas.edu>
Content-Type: text/plain; charset=us-ascii

Hello - is it possible to calculate hayes factors for MCMCglmm objects, to compare alternative models?

Mikhail

------------------------------

Message: 7
Date: Tue, 28 Aug 2012 00:21:23 +0000 (UTC)
From: Ben Bolker <bbolker at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] who shot the predict method (and did you shoot
        the     deputy as well?)
Message-ID: <loom.20120828T011347-203 at post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Paul Johnson <pauljohn32 at ...> writes:

>
> It seems like I can't leave for a month or two without pieces of mount
> Rushmore falling off of lme4.
>
> Remember last June, I posted about some troubles with the predict
> method and you fixed it.

  Quick question: did predict() *ever* exist for the stable version
of lme4?  (I don't think so, although I may of course be mistaken ...)
  Were you working with the development (r-forge) version
of lme4 before?

  I will take a look at your code ...

  Ben Bolker

>
> And I had some other trouble with fishy looking variance estimates
> from models specified like (1|id) + (0 +x|id).
> I stopped fighting with that because I just couldn't understand it at
> all, but today I took it up again.
>
> Now the fishy looking variance estimates are completely solved
> (yeah!), but somebody stole the predict function entirely:
>
> > m3newdat$mm4pred <- predict(mm4, newdata = m3newdat)
> Error in UseMethod("predict") :
>   no applicable method for 'predict' applied to an object of class "mer"
>
> Have I gotten shunted to the wrong lme4 packaging?

------------------------------

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

End of R-sig-mixed-models Digest, Vol 68, Issue 32