[R-sig-ME] zero variance query

Emmanuel Charpentier charpent at bacbuc.dyndns.org
Tue Jun 2 20:07:56 CEST 2009


Le mardi 02 juin 2009 à 10:45 +0100, Christine Griffiths a écrit :
> Dear Emmanuel and Ben
> 
> Many thanks for your advice. Unfortunately, I don't think that I can offset 
> with log(area), given that each area is the same. My rationale for 
> converting to m2 was to standardise abundances to 1 m2 as I have other 
> parameters which were measured to different areas. I had previously 
> attempted to normalise my data by logging but felt that it did not improve 
> the distribution. I just hadn't tried it in my modelling. Logging my count 
> data dramatically improved the fit of the model (AIC 116.7 v 312.5), 
> however the variance still remains low. Does this appear acceptable?
> Furthermore, can I assess model fit of different transformations of the 
> same dataset using AIC values, i.e. compare log(Count) and inverse 
> transformed Count?
> 
> lncount<-log(Count+1)
> m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)

Aaargh ! I thought that "lmer(lncount~Treatment+(1|Month)+(1|
Block),family=gaussian)" (+/- log(area somewhere in the model or the
fit) might give good hints as a first (known bad) approximation, and
didn't made myself clear. That's what happens when you try to be
sarcastic while dog-tired...

And maybe your random effects *are* quite low. What happens with :
summary(glm(Count~Treatment+Month+Block, family=quasipoisson) (or
something to that effect) ? What says a simple crosstabulation ? or a
graph ("caterpillar", for example) ?

HTH,

					Emmanuel Charpentier


> summary(m1)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
>    AIC   BIC logLik deviance
>  116.7 135.1 -52.33    104.7
> Random effects:
>  Groups   Name        Variance   Std.Dev.
>  Month    (Intercept) 1.8937e-14 1.3761e-07
>  Block    (Intercept) 3.5018e-02 1.8713e-01
>  Residual             3.9318e-01 6.2704e-01
> Number of obs: 160, groups: Month, 10; Block, 6
> 
> Fixed effects:
>                    Estimate Std. Error t value
> (Intercept)         -0.4004     0.1239  -3.232
> Treatment2.Radiata   0.4596     0.1305   3.522
> Treatment3.Aldabra   0.4295     0.1334   3.220
> 
> Correlation of Fixed Effects:
>             (Intr) Trt2.R
> Trtmnt2.Rdt -0.581
> Trtmnt3.Ald -0.577  0.530
> 
> I used quasipoisson as my data is overdispersed. It was further improved by 
> an inverse transformation (AIC 43.54). Again I have small variances.
> 
> invcount<-1/(Count+1)
> m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
> summary(m3)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
>    AIC BIC logLik deviance
>  43.54  62 -15.77    31.54
> Random effects:
>  Groups   Name        Variance  Std.Dev.
>  Month    (Intercept) 0.0000000 0.000000
>  Block    (Intercept) 0.0021038 0.045867
>  Residual             0.0926225 0.304339
> Number of obs: 160, groups: Month, 10; Block, 6
> 
> Fixed effects:
>                    Estimate Std. Error t value
> (Intercept)        -0.51644    0.05411  -9.545
> Treatment2.Radiata -0.36246    0.08401  -4.314
> Treatment3.Aldabra -0.29319    0.08197  -3.577
> 
> Correlation of Fixed Effects:
>             (Intr) Trt2.R
> Trtmnt2.Rdt -0.566
> Trtmnt3.Ald -0.580  0.372
> 
> Log(Abundance) did not solve the problem of zero variance. If quasipoisson 
> errors are not acceptable to use with abundance, i.e. non-integers, is 
> there a family of errors that would be recommended? Or should I simply 
> multiply abundance to obtain whole numbers?
> 
> Many thanks in advance,
> Christine
> 
> 
> --On 01 June 2009 23:17 -0400 Ben Bolker <bolker-hnhdhkBXzx8 at public.gmane.org> wrote:
> 
> > Emmanuel Charpentier wrote:
> >> Le lundi 01 juin 2009 à 18:00 +0100, Christine Griffiths a écrit :
> >>> Dear R users,
> >>>
> >>> I am having a problem with getting zero variance in my lmer models
> >>> which  specify two random effects. Having scoured the help lists, I
> >>> have read that  this could be because my variables are strongly
> >>> correlated. However, when I  simplify my model I still encounter the
> >>> same problem.
> >>>
> >>> My response variable is abundance which ranges from 0-0.14.
> >>>
> >>> Below is an example of my model:
> >>>> m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisson)
> >>>> summary(m1)
> >>> Generalized linear mixed model fit by the Laplace approximation
> >>> Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
> >>>    AIC   BIC logLik deviance
> >>>  17.55 36.00 -2.777    5.554
> >>> Random effects:
> >>>  Groups   Name        Variance   Std.Dev.
> >>>  Month    (Intercept) 5.1704e-17 7.1906e-09
> >>>  Block    (Intercept) 0.0000e+00 0.0000e+00
> >>>  Residual             1.0695e-03 3.2704e-02
> >>> Number of obs: 160, groups: Month, 10; Block, 6
> >>>
> >>> Fixed effects:
> >>>                    Estimate Std. Error t value
> >>> (Intercept)        -3.73144    0.02728 -136.80
> >>> Treatment2.Radiata  0.58779    0.03521   16.69
> >>> Treatment3.Aldabra  0.47269    0.03606   13.11
> >>>
> >>> Correlation of Fixed Effects:
> >>>             (Intr) Trt2.R
> >>> Trtmnt2.Rdt -0.775
> >>> Trtmnt3.Ald -0.756  0.586
> >>>
> >>> 1. Is it wrong to treat this as count data?
> >>
> >> Hmmm... IST vaguely R that, when the world was young and I was (already)
> >> silly, Poisson distribution used to be a *discrete* distribution. Of
> >> course, this may or may not stand for "quasi"Poisson (for some value of
> >> "quasi").
> >>
> >> May I inquire if you tried to analyze log(Abundance) (or log(Count),
> >> maybe including log(area) in the model) ?
> >>
> >> HTH,
> >>
> >> 					Emmanuel Charpentier
> >>
> >>> 2. I would like to retain these as random factors given that I designed
> >>> my  experiment as a randomised block design and repeated measures,
> >>> albeit  non-orthogonal and unbalanced. Is it acceptable to retain these
> >>> random  factors, is all else is correct?
> >
> >    I think so ...
> >
> >>> 3. The above response variable was calculated per m2 by dividing the
> >>> Count  by the sample area. When I used the Count (range 0-9) as my
> >>> response  variable, I get a small but reasonable variation of random
> >>> effects. Could  anyone explain why this occurs and whether one response
> >>> variable is better  than another?
> >
> >   To agree with what Emmanuel said above: you should use Count~...,
> > offset=log(area) for the correct analysis ...  that should solve
> > both your technical (zero random effects) and conceptual (even
> > quasiPoisson should be discrete data) issues.
> >
> >>>
> >>>> m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
> >>>> summary(m2)
> >>> Generalized linear mixed model fit by the Laplace approximation
> >>> Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
> >>>    AIC BIC logLik deviance
> >>>  312.5 331 -150.3    300.5
> >>> Random effects:
> >>>  Groups   Name        Variance Std.Dev.
> >>>  Month    (Intercept) 0.14591  0.38198
> >>>  Block    (Intercept) 0.58690  0.76609
> >>>  Residual             2.79816  1.67277
> >>> Number of obs: 160, groups: Month, 10; Block, 6
> >>>
> >>> Fixed effects:
> >>>                    Estimate Std. Error t value
> >>> (Intercept)          0.3098     0.3799  0.8155
> >>> Treatment2.Radiata   0.5879     0.2299  2.5575
> >>> Treatment3.Aldabra   0.5745     0.2382  2.4117
> >>>
> >>> Correlation of Fixed Effects:
> >>>             (Intr) Trt2.R
> >>> Trtmnt2.Rdt -0.347
> >>> Trtmnt3.Ald -0.348  0.536
> >>>
> >>> Many thanks,
> >>> Christine
> >>>
> >>
> >> _______________________________________________
> >> R-sig-mixed-models at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
> >
> > --
> > Ben Bolker
> > Associate professor, Biology Dep't, Univ. of Florida
> > bolker at ufl.edu / www.zoology.ufl.edu/bolker
> > GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
> >
> > _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
> 
> 
> ----------------------
> Christine Griffiths
> School of Biological Sciences
> University of Bristol
> Woodland Road
> Bristol BS8 1UG
> Tel: 0117 9287593
> Fax 0117 925 7374
> Christine.Griffiths at bristol.ac.uk
> http://www.bio.bris.ac.uk/research/mammal/tortoises.html
> 




More information about the R-sig-mixed-models mailing list