[R-sig-ME] mixed-effects model specification question
Andrew Robinson
A.Robinson at ms.unimelb.edu.au
Wed May 7 10:45:19 CEST 2008
Hi Mark,
On Wed, May 07, 2008 at 01:57:41AM -0400, Mark Kimpel wrote:
> Andrew, our emails must have crossed on the net. You have come to
> essentially the same conclusion as I, that Tissue/Region should be
> discarded as an effect and the 3 samples per rat treated equally. What
> you seem to be suggesting, however, is that averaging these 3 samples
> to come up with one summary measure per animal, and thus performing a
> t-test on 2 groups of 6 animals, would be the same as using all 36
> measurements with rat as a random effect. I'm not a statistician, but
> it seems to me that those 3 measurements within each rat give some
> information about the reliability of those measurements and thus would
> contribute useful information to the model. If, for example, the
> variance within animal was much greater than the variance between
> animals, then even if the means between the groups were different, I
> would have less confidence that the population means were different
> than I would if the measurements were tightly grouped within animal.
> (I'm ignoring, for simplification, the within-group between-animal
> variance).
yes, I see your reasoning. I suspect that the within-rat variance
would have to be quite large to make an effect, but as you say, you
can try it both ways.
Cheers
Andrew
> I guess the only way to know for sure would be to do it both ways and
> see if the conclusions were any different. Since I will be repeating
> this over 31,000 genes, any difference between the approaches should be
> apparent.
> Mark
>
> On Wed, May 7, 2008 at 1:35 AM, Andrew Robinson
> <[1]A.Robinson at ms.unimelb.edu.au> wrote:
>
> On Tue, May 06, 2008 at 08:56:51PM -0400, Mark Kimpel wrote:
> > The "treatments" are actually two selectively bred lines of rats,
> with
> > 6 animals from each line. We need p-values as a filter to
> determine
> > which of ~31,000 genes on the array are significant. The p-values
> are
> > subjected to FDR correction, of course. I've never seen a
> micro-array
> > paper published without p-values.
> > As for Andrew's comments:
> > 1. yes, there are three measurments taken per rat, but they are
> from
> > different brain regions. Since each brain region was processed
> > separately, however, the region and batch effects cannot be
> separated.
> > Having said that, I still label this factor as "region" in my
> model.
>
> I'm a bit confused still - sorry - do you therefore have three
> batches
> per rat, for a total of 36 batches (with two missing possibly)? If
> so
> then I still think that your model is over-parametrized, and that
> you
> can eliminate the Tissue random effect, based on your model output
> below (thanks for including it).
>
> > 2. In the past I have done exactly as you suggested and averaged
> brain
> > regions, but I was hoping nlme would allow for more
> sophistication.
> > Because certain genes vary greatly across brain regions, averaging
> > would seem to discard some useful information which could be used
> to
> > improve the inference made on treatment effect.
>
> Can you specify exactly what useful information is being used in the
> analysis that would be discarded if you took averages?
>
> > If I used averaging, I would just have 12 measurements made on 12
> > rats and could do a simple t-test, but I don't think that would
> > be preferable, would it?
>
> It's not clear to me that the conclusions will be different. If
> not,
> then the simpler analysis seems to be the better one. I would
> suggest
> that you compare them and see what happens.
> Cheers,
> Andrew
>
> > Here is the output of my model:
> > myModel <- lme(fixed = geneExpression ~ Treatment, random =
> ~1 |
> > Rat/Tissue)
> > > myModel
> > Linear mixed-effects model fit by REML
> > Data: NULL
> > Log-restricted-likelihood: 40.60593
> > Fixed: geneExpression ~ Treatment
> > (Intercept) TreatmentLAD
> > 12.628465893 -0.009280945
> > Random effects:
> > Formula: ~1 | Rat
> > (Intercept)
> > StdDev: 2.752519e-06
> > Formula: ~1 | Tissue %in% Rat
> > (Intercept) Residual
> > StdDev: 0.06226097 0.0002623086
> > Number of Observations: 34
> > Number of Groups:
> > Rat Tissue %in% Rat
> > 12 34
> > > summary(myModel)
> > Linear mixed-effects model fit by REML
> > Data: NULL
> > AIC BIC logLik
> > -71.21185 -63.88317 40.60593
> > Random effects:
> > Formula: ~1 | Rat
> > (Intercept)
> > StdDev: 2.752519e-06
> > Formula: ~1 | Tissue %in% Rat
> > (Intercept) Residual
> > StdDev: 0.06226097 0.0002623086
> > Fixed effects: geneExpression ~ Treatment
> > Value Std.Error DF t-value p-value
> > (Intercept) 12.628466 0.01510064 22 836.2870 0.0000
> > TreatmentLAD -0.009281 0.02135553 10 -0.4346 0.6731
> > Correlation:
> > (Intr)
> > TreatmentLAD -0.707
> > Standardized Within-Group Residuals:
> > Min Q1 Med Q3
> Max
> > -0.0083222376 -0.0027523180 -0.0001242287 0.0034693142
> 0.0087149017
> > Number of Observations: 34
> > Number of Groups:
> > Rat Tissue %in% Rat
> > 12 34
> > >
> >
>
> > On Tue, May 6, 2008 at 4:42 PM,
> <[1][2]Antonio_Paredes at aphis.usda.gov>
>
> > wrote:
> >
> > How were animals allocated to treatments (at random)? What about
> the
> > housing of the animals. Both of these factors could be
> influential
> > in selecting the unit of study.
> > You also need to get an ideal at to why a p-value is enough to
> > support the objective of the study. Why not estimation of
> treatment
> > effects?
> > Tony.
> >
>
> > Andrew Robinson <[2][3]A.Robinson at ms.unimelb.edu.au>
> > Sent by: [3][4]r-sig-mixed-models-bounces at r-project.org
>
> >
> > 05/06/2008 03:19 PM
> >
> >
> To
> >
>
> > Mark Kimpel <[4][5]mwkimpel at gmail.com>
> >
> >
> cc
> >
> > [5][6]r-sig-mixed-models at r-project.org
>
> >
> >
> Subject
> >
> > Re: [R-sig-ME] mixed-effects model specification question
> >
> > Mark,
> > You should talk to a local statistician about this, but I think
> that
> > you can probably average across the measurements within each rat,
> if
> > all you are interested in is a treatment effect. For the analysis
> of
> > treatment the relevant unit of replication should be the rat, in
> any
> > case.
> > (Does anyone have any thoughts on why that might be a bad idea?)
> > Also if I understand your design, there are three batches per rat.
> I
> > suspect that using Rat/Tissue would lead to an over-parametrized
> > model, if my interpretation is correct. My guess is that Rat
> should
> > be adequate.
> > Final points
> > 1) My speculations would be better-informed if you showed us the
> > output of the model that you proposed - fyi :) .
> > 2) You could try all three configurations and see if it makes any
> > difference to the inference of interest. I suspect that it will
> > not.
> > I hope that this helps,
> > Andrew
> > On Tue, May 06, 2008 at 10:23:08AM -0400, Mark Kimpel wrote:
> > > Perhaps a bit of a newbie question, but I need to get this
> right. I
> > need to
> > > make sure I am specifying a model correctly. Here is our
> > less-than-perfect
> > > experimental design:
> > >
> > > 36 rats divided into two treatment groups, i.e. 18 per group
> > >
> > > each rat has measurements taken on 3 brain regions, but each
> brain
> > region
> > > was analyzed in a separate batch (there are strong batch
> effects) so
> > we
> > > really can't compare the regions per se, but do recognize that
> the 3
> > regions
> > > from a single rat have variance above and beyond that accounted
> for
> > by
> > > technical factors. Since, because of the batch effect we are not
> > going to
> > > look at the effect of brain region, I "think" this should be a
> > considered a
> > > random effect.
> > >
> > > So I have rat, treatment, and region(batch) as variables. The
> only
> > thing I
> > > am interested in getting a p-value for is treatment.
> > >
> > > Is the model below correct and can I squeek by with using nlme
> to get
> > a
> > > p-value (notwithstanding recent threads on this list)?
> > >
> > > myModel <- lme(fixed = geneExpression ~ Treatment, random =
> ~1 |
> > > Rat/Tissue)
> > >
> > > Thanks,
> > > Mark
> > > --
> > > Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> > > Indiana University School of Medicine
> > >
> > > 15032 Hunter Court, Westfield, IN 46074
> > >
> > > (317) 490-5129 Work, & Mobile & VoiceMail
> > > (317) 663-0513 Home (no voice mail please)
> > >
> > >
> ******************************************************************
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > _______________________________________________
>
> > > [6][7]R-sig-mixed-models at r-project.org mailing list
> > >
> [7][8]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> > --
> > Andrew Robinson
> > Department of Mathematics and Statistics Tel:
> > +61-3-8344-6410
> > University of Melbourne, VIC 3010 Australia Fax:
> > +61-3-8344-4599
>
> > [8][9]http://www.ms.unimelb.edu.au/~andrewpr
> > [9][10]http://blogs.mbs.edu/fishing-in-the-bay/
> > _______________________________________________
> > [10][11]R-sig-mixed-models at r-project.org mailing list
> >
> [11][12]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> >
> > --
> > Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> > Indiana University School of Medicine
> > 15032 Hunter Court, Westfield, IN 46074
> > (317) 490-5129 Work, & Mobile & VoiceMail
> > (317) 663-0513 Home (no voice mail please)
> > **********************************************************
> ********
> >
>
> > References
> >
> > 1. mailto:[13]Antonio_Paredes at aphis.usda.gov
> > 2. mailto:[14]A.Robinson at ms.unimelb.edu.au
> > 3. mailto:[15]r-sig-mixed-models-bounces at r-project.org
> > 4. mailto:[16]mwkimpel at gmail.com
> > 5. mailto:[17]r-sig-mixed-models at r-project.org
> > 6. mailto:[18]R-sig-mixed-models at r-project.org
> > 7. [19]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> > 8. [20]http://www.ms.unimelb.edu.au/%7Eandrewpr
> > 9. [21]http://blogs.mbs.edu/fishing-in-the-bay/
> > 10. mailto:[22]R-sig-mixed-models at r-project.org
> > 11. [23]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>
> --
> Andrew Robinson
> Department of Mathematics and Statistics Tel:
> +61-3-8344-6410
> University of Melbourne, VIC 3010 Australia Fax:
> +61-3-8344-4599
> [24]http://www.ms.unimelb.edu.au/~andrewpr
> [25]http://blogs.mbs.edu/fishing-in-the-bay/
>
> --
> Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
> 15032 Hunter Court, Westfield, IN 46074
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 663-0513 Home (no voice mail please)
> ******************************************************************
>
> References
>
> 1. mailto:A.Robinson at ms.unimelb.edu.au
> 2. mailto:Antonio_Paredes at aphis.usda.gov
> 3. mailto:A.Robinson at ms.unimelb.edu.au
> 4. mailto:r-sig-mixed-models-bounces at r-project.org
> 5. mailto:mwkimpel at gmail.com
> 6. mailto:r-sig-mixed-models at r-project.org
> 7. mailto:R-sig-mixed-models at r-project.org
> 8. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 9. http://www.ms.unimelb.edu.au/%7Eandrewpr
> 10. http://blogs.mbs.edu/fishing-in-the-bay/
> 11. mailto:R-sig-mixed-models at r-project.org
> 12. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 13. mailto:Antonio_Paredes at aphis.usda.gov
> 14. mailto:A.Robinson at ms.unimelb.edu.au
> 15. mailto:r-sig-mixed-models-bounces at r-project.org
> 16. mailto:mwkimpel at gmail.com
> 17. mailto:r-sig-mixed-models at r-project.org
> 18. mailto:R-sig-mixed-models at r-project.org
> 19. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 20. http://www.ms.unimelb.edu.au/%7Eandrewpr
> 21. http://blogs.mbs.edu/fishing-in-the-bay/
> 22. mailto:R-sig-mixed-models at r-project.org
> 23. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 24. http://www.ms.unimelb.edu.au/%7Eandrewpr
> 25. http://blogs.mbs.edu/fishing-in-the-bay/
--
Andrew Robinson
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/
More information about the R-sig-mixed-models
mailing list