[R-sig-ME] mixed-effects model specification question

Wed May 7 07:35:27 CEST 2008

On Tue, May 06, 2008 at 08:56:51PM -0400, Mark Kimpel wrote:
>    The "treatments" are actually two selectively bred lines of rats, with
>    6 animals from each line. We need p-values as a filter to determine
>    which of ~31,000 genes on the array are significant. The p-values are
>    subjected to FDR correction, of course. I've never seen a micro-array
>    paper published without p-values.
>    As for Andrew's comments:
>    1. yes, there are three measurments taken per rat, but they are from
>    different brain regions. Since each brain region was processed
>    separately, however, the region and batch effects cannot be separated.
>    Having said that, I still label this factor as "region" in my model.

I'm a bit confused still - sorry - do you therefore have three batches
per rat, for a total of 36 batches (with two missing possibly)?  If so
then I still think that your model is over-parametrized, and that you
can eliminate the Tissue random effect, based on your model output
below (thanks for including it).

>    2. In the past I have done exactly as you suggested and averaged brain
>    regions, but I was hoping nlme would allow for more sophistication.
>    Because certain genes vary greatly across brain regions, averaging
>    would seem to discard some useful information which could be used to
>    improve the inference made on treatment effect. 

Can you specify exactly what useful information is being used in the
analysis that would be discarded if you took averages?

>    If I used averaging, I would just have 12 measurements made on 12
>    rats and could do a simple t-test, but I don't think that would
>    be preferable, would it?

It's not clear to me that the conclusions will be different.  If not,
then the simpler analysis seems to be the better one.  I would suggest
that you compare them and see what happens.

Cheers,

Andrew

>    Here is the output of my model:
>         myModel <- lme(fixed = geneExpression ~  Treatment,  random = ~1 |
>    Rat/Tissue)
>    > myModel
>    Linear mixed-effects model fit by REML
>      Data: NULL
>      Log-restricted-likelihood: 40.60593
>      Fixed: geneExpression ~ Treatment
>     (Intercept) TreatmentLAD
>    12.628465893 -0.009280945
>    Random effects:
>     Formula: ~1 | Rat
>             (Intercept)
>    StdDev: 2.752519e-06
>     Formula: ~1 | Tissue %in% Rat
>            (Intercept)     Residual
>    StdDev:  0.06226097 0.0002623086
>    Number of Observations: 34
>    Number of Groups:
>                Rat Tissue %in% Rat
>                 12              34
>    > summary(myModel)
>    Linear mixed-effects model fit by REML
>     Data: NULL
>            AIC       BIC   logLik
>      -71.21185 -63.88317 40.60593
>    Random effects:
>     Formula: ~1 | Rat
>             (Intercept)
>    StdDev: 2.752519e-06
>     Formula: ~1 | Tissue %in% Rat
>            (Intercept)     Residual
>    StdDev:  0.06226097 0.0002623086
>    Fixed effects: geneExpression ~ Treatment
>                     Value  Std.Error DF  t-value p-value
>    (Intercept)  12.628466 0.01510064 22 836.2870  0.0000
>    TreatmentLAD -0.009281 0.02135553 10  -0.4346  0.6731
>     Correlation:
>                 (Intr)
>    TreatmentLAD -0.707
>    Standardized Within-Group Residuals:
>              Min            Q1           Med            Q3           Max
>    -0.0083222376 -0.0027523180 -0.0001242287  0.0034693142  0.0087149017
>    Number of Observations: 34
>    Number of Groups:
>                Rat Tissue %in% Rat
>                 12              34
>    >
> 
>    On Tue, May 6, 2008 at 4:42 PM, <[1]Antonio_Paredes at aphis.usda.gov>
>    wrote:
> 
>      How were animals allocated to treatments (at random)? What about the
>      housing of the animals. Both of these factors could be influential
>      in selecting the unit of study.
>      You also need to get an ideal at to why a p-value is enough to
>      support the objective of the study. Why not estimation of treatment
>      effects?
>      Tony.
> 
>    Andrew Robinson <[2]A.Robinson at ms.unimelb.edu.au>
>    Sent by: [3]r-sig-mixed-models-bounces at r-project.org
> 
>    05/06/2008 03:19 PM
> 
>                                                                         To
> 
>    Mark Kimpel <[4]mwkimpel at gmail.com>
> 
>                                                                         cc
> 
>    [5]r-sig-mixed-models at r-project.org
> 
>                                                                    Subject
> 
>    Re: [R-sig-ME] mixed-effects model specification question
> 
>    Mark,
>    You should talk to a local statistician about this, but I think that
>    you can probably average across the measurements within each rat, if
>    all you are interested in is a treatment effect.  For the analysis of
>    treatment the relevant unit of replication should be the rat, in any
>    case.
>    (Does anyone have any thoughts on why that might be a bad idea?)
>    Also if I understand your design, there are three batches per rat.  I
>    suspect that using Rat/Tissue would lead to an over-parametrized
>    model, if my interpretation is correct.  My guess is that Rat should
>    be adequate.
>    Final points
>    1) My speculations would be better-informed if you showed us the
>      output of the model that you proposed - fyi :) .
>    2) You could try all three configurations and see if it makes any
>      difference to the inference of interest.  I suspect that it will
>      not.
>    I hope that this helps,
>    Andrew
>    On Tue, May 06, 2008 at 10:23:08AM -0400, Mark Kimpel wrote:
>    > Perhaps a bit of a newbie question, but I need to get this right. I
>    need to
>    > make sure I am specifying a model correctly. Here is our
>    less-than-perfect
>    > experimental design:
>    >
>    > 36 rats divided into two treatment groups, i.e. 18 per group
>    >
>    > each rat has measurements taken on 3 brain regions, but each brain
>    region
>    > was analyzed in a separate batch (there are strong batch effects) so
>    we
>    > really can't compare the regions per se, but do recognize that the 3
>    regions
>    > from a single rat have variance above and beyond that accounted for
>    by
>    > technical factors. Since, because of the batch effect we are not
>    going to
>    > look at the effect of brain region, I "think" this should be a
>    considered a
>    > random effect.
>    >
>    > So I have rat, treatment, and region(batch) as variables. The only
>    thing I
>    > am interested in getting a p-value for is treatment.
>    >
>    > Is the model below correct and can I squeek by with using nlme to get
>    a
>    > p-value (notwithstanding recent threads on this list)?
>    >
>    >  myModel <- lme(fixed = geneExpression ~  Treatment,  random = ~1 |
>    > Rat/Tissue)
>    >
>    > Thanks,
>    > Mark
>    > --
>    > Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
>    > Indiana University School of Medicine
>    >
>    > 15032 Hunter Court, Westfield, IN 46074
>    >
>    > (317) 490-5129 Work, & Mobile & VoiceMail
>    > (317) 663-0513 Home (no voice mail please)
>    >
>    > ******************************************************************
>    >
>    >                  [[alternative HTML version deleted]]
>    >
>    > _______________________________________________
>    > [6]R-sig-mixed-models at r-project.org mailing list
>    > [7]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>    --
>    Andrew Robinson
>    Department of Mathematics and Statistics            Tel:
>    +61-3-8344-6410
>    University of Melbourne, VIC 3010 Australia         Fax:
>    +61-3-8344-4599
>    [8]http://www.ms.unimelb.edu.au/~andrewpr
>    [9]http://blogs.mbs.edu/fishing-in-the-bay/
>    _______________________________________________
>    [10]R-sig-mixed-models at r-project.org mailing list
>    [11]https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> 
>    --
>    Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
>    Indiana University School of Medicine
>    15032 Hunter Court, Westfield, IN 46074
>    (317) 490-5129 Work, & Mobile & VoiceMail
>    (317) 663-0513 Home (no voice mail please)
>    ********************************************************** ********
> 
> References
> 
>    1. mailto:Antonio_Paredes at aphis.usda.gov
>    2. mailto:A.Robinson at ms.unimelb.edu.au
>    3. mailto:r-sig-mixed-models-bounces at r-project.org
>    4. mailto:mwkimpel at gmail.com
>    5. mailto:r-sig-mixed-models at r-project.org
>    6. mailto:R-sig-mixed-models at r-project.org
>    7. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>    8. http://www.ms.unimelb.edu.au/%7Eandrewpr
>    9. http://blogs.mbs.edu/fishing-in-the-bay/
>   10. mailto:R-sig-mixed-models at r-project.org
>   11. https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-- 
Andrew Robinson  
Department of Mathematics and Statistics            Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia         Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/