[R-sig-ME] variance structure

Wed Apr 6 22:02:03 CEST 2011

Hi Cristina,

CC-ing back to the list ....

With a bivariate model you can include a covariance between the random  
effects for zero/not-zero and the actual value of the response when  
not zero. In two univariate analyses you assume the covariance is  
zero. Its a small difference.

Below is an example with some made up data, its quite involved.

Cheers,

Jarrod

library(MASS)
library(MCMCglmm)

G<-matrix(c(1,0.5,0.5,1),2,2)
E<-diag(2)

# generate covariance matrices for random effects (G) and residuals (E)

u<-mvrnorm(40, c(0,0), G)
e<-mvrnorm(400, c(0,0), E)

# sample random effects and residuals

group<-gl(40,10)

# group identifier

y1<-u[,1][group]+e[,1]

# Gaussian data if it had been observed

y2<-rbinom(400, 1, plogis(u[,2][group]+e[,2]))

# binary trait 1 = Gaussian data observed 0 = recorded as zero.

y1<-y1[-which(y2==0)]

# remove Gaussian data that has been set to zero.

dat<-data.frame(y=c(y1,y2), trait=c(rep("nzero", length(y1)),  
rep("zero", length(y2))),  family=c(rep("gaussian", length(y1)),  
rep("categorical", length(y2))), group=c(group[-which(y2==0)], group))

# This is the odd bit! Pass the Gaussian data and the binary data as a  
single response y, indicate which process the data belong to in trait,  
and then specify the family each datum comes from.

prior=list(R=list(V=diag(2), nu=0.002, fix=2),  
G=list(G1=list(V=diag(2), nu=2, alpha.mu=rep(0,2),  
alpha.V=diag(2)*1000)))

# set up a prior - the key thing is that the observation-level  
variance of a binary variable cannot be estimated so I 've set it to  
one.

m1<-MCMCglmm(y~trait-1, random=~us(trait):group,  
rcov=~idh(trait):units, data=dat, family=NULL, prior=prior)

# The model has two fixed effects, where the first is the intercept  
for the Gauusian data and the second is the intercpet for the binary  
variable (on the logit scale).

# us(trait):group estimates a variance for the u's of each process AND  
te covariance. idh(trait):units fixes the residual covariance between  
the wto processes to zero (it cannot be estimated)

# family=NULL indicates that you have specified a distribution for  
each datum in the data.frame

# You will get an error message if for some groups all data are zero,  
or all data are non-zero. If you decide to go ahead with the analysis  
(again I think it is overkill) and you encounter this problem I will  
show you how to get round it.

Quoting Cristina Gomes <gomes_cristina at ymail.com> on Wed, 6 Apr 2011  
12:00:51 -0700 (PDT):

> Thank you very much for your quick and helpful answer Jarred!
>
> The non-zero data are well separated from the zeros, but what is the actual
> difference between doing a bivariate binary/gaussian model and what  
> I did (i.e.
> running two models: one Logistic with a binary response and the  
> whole data set,
> and one Gaussian excluding the zeros and using only the rest of the  
> data which
> is normally distributed), and how would I implement the bivariate model? Is
> there a straightforward method or would I have to model it?
>
> Thanks again!
> Cheers,
> Cristina.
>
>
>
>
>
> ________________________________
> From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
> To: Ben Bolker <bbolker at gmail.com>
> Cc: Cristina Gomes <gomes_cristina at ymail.com>;  
> r-sig-mixed-models at r-project.org
> Sent: Wed, April 6, 2011 2:03:42 PM
> Subject: Re: [R-sig-ME] variance structure
>
> Hi Cristina,
>
> I agree with Ben. In addition,  MCMCglmm will not fit ZIP models to  
> these data
> (because the data are not integers) and ZIG (zero-inflated Gaussian)  
> models are
> not implemented. In fact, I can't really see what the ZIG likelihood  
> would look
> like, but anyway ...
>
> If the non-zero data are well separated from the zero's (i.e. if pnorm(0,
> mean(y[which(y!=0)]), sd(y[which(y!=0)]))  is small) then fitting a bivariate
> binary/gaussian model is one option, but perhaps more complex than  
> your problem
> requires.
>
> Cheers,
>
> Jarrod
>
>
>
>
>
>
>
>
>
> On 6 Apr 2011, at 02:51, Ben Bolker wrote:
>
>> On 11-04-05 06:30 PM, Cristina Gomes wrote:
>>> Hello,
>>>
>>> I'm running a model where the response is normally distributed and has an
>>> excess
>>> of zeros. Because I was familiar with the lmer package I used it to run a
> GLMM
>>> on this response, and addressed the problem of the excess of zeros  
>>> by running
>>> two models: one with the response as a binary one, using the complete data
>> set,
>>> and another excluding all the zeros and using the remaining  
>>> response values in
>>> a
>>> Gaussian model. This seemed to work fine. However, a reviewer  
>>> suggested using
>>> the whole data set and a zero-inflated poisson error structure in the
> MCMCglmm
>>> package. I donâ€™t know if this is appropriate as my response are  
>>> rates (grams
>>> of
>>> meat consumed per hr of observation) and not discrete values.
>>>
>>
>>  I would give advice on how to get zero-inflated models working with
>> MCMCglmm (mainly, see Ch 5 of the "CourseNotes" vignette that comes with
>> MCMCglmm, but I think the reviewer is wrong to think that you could use
>> a zero-inflated Poisson.  I think the way you did it is fine.  However,
>> if you *wanted* to be fancy you might (after careful reading of Ch 5,
>> and thought) be able to set up a zero-inflated normal model in MCMCglmm
>> in a way analogous to the way zero-inflated Poissons are set up.
>>
>>  Ben Bolker
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
> --The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.