[R-sig-ME] zero variance query

Wed Jun 3 12:05:28 CEST 2009

Dear All

Thank you for all your incredibly helpful advice. Having done as suggested 
by Emmanuel, I have come to the conclusion that there is basically very 
little variation among my data, which is perhaps not surprising given that 
I have a small count range 0-9. I was just concerned that such low variance 
meant that there was something wrong with my analysis.

Emmanuel, logging my data does not improve the distribution and so I think 
I cannot use Gaussian errors. I have a large number of zeros. These are off 
biological significance and so I am interested in retaining them in the 
model.

Since Anna mentioned that quasi models are unstable in lme4, would you 
recommend using poisson models and calculating QAIC instead?

Many thanks
Christine

--On 02 June 2009 20:07 +0200 Emmanuel Charpentier 
<charpent at bacbuc.dyndns.org> wrote:

> Le mardi 02 juin 2009 à 10:45 +0100, Christine Griffiths a écrit :
>> Dear Emmanuel and Ben
>>
>> Many thanks for your advice. Unfortunately, I don't think that I can
>> offset  with log(area), given that each area is the same. My rationale
>> for  converting to m2 was to standardise abundances to 1 m2 as I have
>> other  parameters which were measured to different areas. I had
>> previously  attempted to normalise my data by logging but felt that it
>> did not improve  the distribution. I just hadn't tried it in my
>> modelling. Logging my count  data dramatically improved the fit of the
>> model (AIC 116.7 v 312.5),  however the variance still remains low. Does
>> this appear acceptable? Furthermore, can I assess model fit of different
>> transformations of the  same dataset using AIC values, i.e. compare
>> log(Count) and inverse  transformed Count?
>>
>> lncount<-log(Count+1)
>> m1<-m1<-lmer(lncount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
>
> Aaargh ! I thought that "lmer(lncount~Treatment+(1|Month)+(1|
> Block),family=gaussian)" (+/- log(area somewhere in the model or the
> fit) might give good hints as a first (known bad) approximation, and
> didn't made myself clear. That's what happens when you try to be
> sarcastic while dog-tired...
>
> And maybe your random effects *are* quite low. What happens with :
> summary(glm(Count~Treatment+Month+Block, family=quasipoisson) (or
> something to that effect) ? What says a simple crosstabulation ? or a
> graph ("caterpillar", for example) ?
>
> HTH,
>
> 					Emmanuel Charpentier
>
>
>> summary(m1)
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: lncount ~ Treatment + (1 | Month) + (1 | Block)
>>    AIC   BIC logLik deviance
>>  116.7 135.1 -52.33    104.7
>> Random effects:
>>  Groups   Name        Variance   Std.Dev.
>>  Month    (Intercept) 1.8937e-14 1.3761e-07
>>  Block    (Intercept) 3.5018e-02 1.8713e-01
>>  Residual             3.9318e-01 6.2704e-01
>> Number of obs: 160, groups: Month, 10; Block, 6
>>
>> Fixed effects:
>>                    Estimate Std. Error t value
>> (Intercept)         -0.4004     0.1239  -3.232
>> Treatment2.Radiata   0.4596     0.1305   3.522
>> Treatment3.Aldabra   0.4295     0.1334   3.220
>>
>> Correlation of Fixed Effects:
>>             (Intr) Trt2.R
>> Trtmnt2.Rdt -0.581
>> Trtmnt3.Ald -0.577  0.530
>>
>> I used quasipoisson as my data is overdispersed. It was further improved
>> by  an inverse transformation (AIC 43.54). Again I have small variances.
>>
>> invcount<-1/(Count+1)
>> m3<-lmer(invcount~Treatment+(1|Month)+(1|Block),family=quasipoisson)
>> summary(m3)
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: invcount ~ Treatment + (1 | Month) + (1 | Block)
>>    AIC BIC logLik deviance
>>  43.54  62 -15.77    31.54
>> Random effects:
>>  Groups   Name        Variance  Std.Dev.
>>  Month    (Intercept) 0.0000000 0.000000
>>  Block    (Intercept) 0.0021038 0.045867
>>  Residual             0.0926225 0.304339
>> Number of obs: 160, groups: Month, 10; Block, 6
>>
>> Fixed effects:
>>                    Estimate Std. Error t value
>> (Intercept)        -0.51644    0.05411  -9.545
>> Treatment2.Radiata -0.36246    0.08401  -4.314
>> Treatment3.Aldabra -0.29319    0.08197  -3.577
>>
>> Correlation of Fixed Effects:
>>             (Intr) Trt2.R
>> Trtmnt2.Rdt -0.566
>> Trtmnt3.Ald -0.580  0.372
>>
>> Log(Abundance) did not solve the problem of zero variance. If
>> quasipoisson  errors are not acceptable to use with abundance, i.e.
>> non-integers, is  there a family of errors that would be recommended? Or
>> should I simply  multiply abundance to obtain whole numbers?
>>
>> Many thanks in advance,
>> Christine
>>
>>
>> --On 01 June 2009 23:17 -0400 Ben Bolker
>> <bolker-hnhdhkBXzx8 at public.gmane.org> wrote:
>>
>> > Emmanuel Charpentier wrote:
>> >> Le lundi 01 juin 2009 à 18:00 +0100, Christine Griffiths a écrit :
>> >>> Dear R users,
>> >>>
>> >>> I am having a problem with getting zero variance in my lmer models
>> >>> which  specify two random effects. Having scoured the help lists, I
>> >>> have read that  this could be because my variables are strongly
>> >>> correlated. However, when I  simplify my model I still encounter the
>> >>> same problem.
>> >>>
>> >>> My response variable is abundance which ranges from 0-0.14.
>> >>>
>> >>> Below is an example of my model:
>> >>>> m1<-lmer(Abundance~Treatment+(1|Month)+(1|Block),family=quasipoisso
>> >>>> n) summary(m1)
>> >>> Generalized linear mixed model fit by the Laplace approximation
>> >>> Formula: Abundance ~ Treatment + (1 | Month) + (1 | Block)
>> >>>    AIC   BIC logLik deviance
>> >>>  17.55 36.00 -2.777    5.554
>> >>> Random effects:
>> >>>  Groups   Name        Variance   Std.Dev.
>> >>>  Month    (Intercept) 5.1704e-17 7.1906e-09
>> >>>  Block    (Intercept) 0.0000e+00 0.0000e+00
>> >>>  Residual             1.0695e-03 3.2704e-02
>> >>> Number of obs: 160, groups: Month, 10; Block, 6
>> >>>
>> >>> Fixed effects:
>> >>>                    Estimate Std. Error t value
>> >>> (Intercept)        -3.73144    0.02728 -136.80
>> >>> Treatment2.Radiata  0.58779    0.03521   16.69
>> >>> Treatment3.Aldabra  0.47269    0.03606   13.11
>> >>>
>> >>> Correlation of Fixed Effects:
>> >>>             (Intr) Trt2.R
>> >>> Trtmnt2.Rdt -0.775
>> >>> Trtmnt3.Ald -0.756  0.586
>> >>>
>> >>> 1. Is it wrong to treat this as count data?
>> >>
>> >> Hmmm... IST vaguely R that, when the world was young and I was
>> >> (already) silly, Poisson distribution used to be a *discrete*
>> >> distribution. Of course, this may or may not stand for "quasi"Poisson
>> >> (for some value of "quasi").
>> >>
>> >> May I inquire if you tried to analyze log(Abundance) (or log(Count),
>> >> maybe including log(area) in the model) ?
>> >>
>> >> HTH,
>> >>
>> >> 					Emmanuel Charpentier
>> >>
>> >>> 2. I would like to retain these as random factors given that I
>> >>> designed my  experiment as a randomised block design and repeated
>> >>> measures, albeit  non-orthogonal and unbalanced. Is it acceptable to
>> >>> retain these random  factors, is all else is correct?
>> >
>> >    I think so ...
>> >
>> >>> 3. The above response variable was calculated per m2 by dividing the
>> >>> Count  by the sample area. When I used the Count (range 0-9) as my
>> >>> response  variable, I get a small but reasonable variation of random
>> >>> effects. Could  anyone explain why this occurs and whether one
>> >>> response variable is better  than another?
>> >
>> >   To agree with what Emmanuel said above: you should use Count~...,
>> > offset=log(area) for the correct analysis ...  that should solve
>> > both your technical (zero random effects) and conceptual (even
>> > quasiPoisson should be discrete data) issues.
>> >
>> >>>
>> >>>> m2<-lmer(Count~Treatment+(1|Month)+(1|Block),family=quasipoisson)
>> >>>> summary(m2)
>> >>> Generalized linear mixed model fit by the Laplace approximation
>> >>> Formula: Count ~ Treatment + (1 | Month) + (1 | Block)
>> >>>    AIC BIC logLik deviance
>> >>>  312.5 331 -150.3    300.5
>> >>> Random effects:
>> >>>  Groups   Name        Variance Std.Dev.
>> >>>  Month    (Intercept) 0.14591  0.38198
>> >>>  Block    (Intercept) 0.58690  0.76609
>> >>>  Residual             2.79816  1.67277
>> >>> Number of obs: 160, groups: Month, 10; Block, 6
>> >>>
>> >>> Fixed effects:
>> >>>                    Estimate Std. Error t value
>> >>> (Intercept)          0.3098     0.3799  0.8155
>> >>> Treatment2.Radiata   0.5879     0.2299  2.5575
>> >>> Treatment3.Aldabra   0.5745     0.2382  2.4117
>> >>>
>> >>> Correlation of Fixed Effects:
>> >>>             (Intr) Trt2.R
>> >>> Trtmnt2.Rdt -0.347
>> >>> Trtmnt3.Ald -0.348  0.536
>> >>>
>> >>> Many thanks,
>> >>> Christine
>> >>>
>> >>
>> >> _______________________________________________
>> >> R-sig-mixed-models at r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >
>> >
>> > --
>> > Ben Bolker
>> > Associate professor, Biology Dep't, Univ. of Florida
>> > bolker at ufl.edu / www.zoology.ufl.edu/bolker
>> > GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
>> >
>> > _______________________________________________
>> > R-sig-mixed-models at r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>
>>
>>
>> ----------------------
>> Christine Griffiths
>> School of Biological Sciences
>> University of Bristol
>> Woodland Road
>> Bristol BS8 1UG
>> Tel: 0117 9287593
>> Fax 0117 925 7374
>> Christine.Griffiths at bristol.ac.uk
>> http://www.bio.bris.ac.uk/research/mammal/tortoises.html
>>
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

----------------------
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
Christine.Griffiths at bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html