[R-sig-ME] variance structure
Jarrod Hadfield
j.hadfield at ed.ac.uk
Wed Apr 6 22:02:03 CEST 2011
Hi Cristina,
CC-ing back to the list ....
With a bivariate model you can include a covariance between the random
effects for zero/not-zero and the actual value of the response when
not zero. In two univariate analyses you assume the covariance is
zero. Its a small difference.
Below is an example with some made up data, its quite involved.
Cheers,
Jarrod
library(MASS)
library(MCMCglmm)
G<-matrix(c(1,0.5,0.5,1),2,2)
E<-diag(2)
# generate covariance matrices for random effects (G) and residuals (E)
u<-mvrnorm(40, c(0,0), G)
e<-mvrnorm(400, c(0,0), E)
# sample random effects and residuals
group<-gl(40,10)
# group identifier
y1<-u[,1][group]+e[,1]
# Gaussian data if it had been observed
y2<-rbinom(400, 1, plogis(u[,2][group]+e[,2]))
# binary trait 1 = Gaussian data observed 0 = recorded as zero.
y1<-y1[-which(y2==0)]
# remove Gaussian data that has been set to zero.
dat<-data.frame(y=c(y1,y2), trait=c(rep("nzero", length(y1)),
rep("zero", length(y2))), family=c(rep("gaussian", length(y1)),
rep("categorical", length(y2))), group=c(group[-which(y2==0)], group))
# This is the odd bit! Pass the Gaussian data and the binary data as a
single response y, indicate which process the data belong to in trait,
and then specify the family each datum comes from.
prior=list(R=list(V=diag(2), nu=0.002, fix=2),
G=list(G1=list(V=diag(2), nu=2, alpha.mu=rep(0,2),
alpha.V=diag(2)*1000)))
# set up a prior - the key thing is that the observation-level
variance of a binary variable cannot be estimated so I 've set it to
one.
m1<-MCMCglmm(y~trait-1, random=~us(trait):group,
rcov=~idh(trait):units, data=dat, family=NULL, prior=prior)
# The model has two fixed effects, where the first is the intercept
for the Gauusian data and the second is the intercpet for the binary
variable (on the logit scale).
# us(trait):group estimates a variance for the u's of each process AND
te covariance. idh(trait):units fixes the residual covariance between
the wto processes to zero (it cannot be estimated)
# family=NULL indicates that you have specified a distribution for
each datum in the data.frame
# You will get an error message if for some groups all data are zero,
or all data are non-zero. If you decide to go ahead with the analysis
(again I think it is overkill) and you encounter this problem I will
show you how to get round it.
Quoting Cristina Gomes <gomes_cristina at ymail.com> on Wed, 6 Apr 2011
12:00:51 -0700 (PDT):
> Thank you very much for your quick and helpful answer Jarred!
>
> The non-zero data are well separated from the zeros, but what is the actual
> difference between doing a bivariate binary/gaussian model and what
> I did (i.e.
> running two models: one Logistic with a binary response and the
> whole data set,
> and one Gaussian excluding the zeros and using only the rest of the
> data which
> is normally distributed), and how would I implement the bivariate model? Is
> there a straightforward method or would I have to model it?
>
> Thanks again!
> Cheers,
> Cristina.
>
>
>
>
>
> ________________________________
> From: Jarrod Hadfield <j.hadfield at ed.ac.uk>
> To: Ben Bolker <bbolker at gmail.com>
> Cc: Cristina Gomes <gomes_cristina at ymail.com>;
> r-sig-mixed-models at r-project.org
> Sent: Wed, April 6, 2011 2:03:42 PM
> Subject: Re: [R-sig-ME] variance structure
>
> Hi Cristina,
>
> I agree with Ben. In addition, MCMCglmm will not fit ZIP models to
> these data
> (because the data are not integers) and ZIG (zero-inflated Gaussian)
> models are
> not implemented. In fact, I can't really see what the ZIG likelihood
> would look
> like, but anyway ...
>
> If the non-zero data are well separated from the zero's (i.e. if pnorm(0,
> mean(y[which(y!=0)]), sd(y[which(y!=0)])) is small) then fitting a bivariate
> binary/gaussian model is one option, but perhaps more complex than
> your problem
> requires.
>
> Cheers,
>
> Jarrod
>
>
>
>
>
>
>
>
>
> On 6 Apr 2011, at 02:51, Ben Bolker wrote:
>
>> On 11-04-05 06:30 PM, Cristina Gomes wrote:
>>> Hello,
>>>
>>> I'm running a model where the response is normally distributed and has an
>>> excess
>>> of zeros. Because I was familiar with the lmer package I used it to run a
> GLMM
>>> on this response, and addressed the problem of the excess of zeros
>>> by running
>>> two models: one with the response as a binary one, using the complete data
>> set,
>>> and another excluding all the zeros and using the remaining
>>> response values in
>>> a
>>> Gaussian model. This seemed to work fine. However, a reviewer
>>> suggested using
>>> the whole data set and a zero-inflated poisson error structure in the
> MCMCglmm
>>> package. I don’t know if this is appropriate as my response are
>>> rates (grams
>>> of
>>> meat consumed per hr of observation) and not discrete values.
>>>
>>
>> I would give advice on how to get zero-inflated models working with
>> MCMCglmm (mainly, see Ch 5 of the "CourseNotes" vignette that comes with
>> MCMCglmm, but I think the reviewer is wrong to think that you could use
>> a zero-inflated Poisson. I think the way you did it is fine. However,
>> if you *wanted* to be fancy you might (after careful reading of Ch 5,
>> and thought) be able to set up a zero-inflated normal model in MCMCglmm
>> in a way analogous to the way zero-inflated Poissons are set up.
>>
>> Ben Bolker
>>
>> _______________________________________________
>> R-sig-mixed-models at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>
>
> --The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list