[R-sig-ME] Help with lme modelling

Wed Apr 1 21:15:57 CEST 2009

Hello,

this doesn't directly answer your questions, but you have got only two
levels for your random effect (i.e. two labs). It is not advisable to fit
mixed effect models with only two levels of the grouping factor (you should
have at least 5 - 6 or so), as the variances can not be reliably estimated.
There have been posts by Prof Bates on this issue, you can find them in the
archives. Thus, you might consider lm models (or gls()), e.g. with lab as
block factor?

HTH

Cheers,

Luca

----- Original Message ----- 
From: "Sanja Rogic" <rogic at bioinformatics.ubc.ca>
To: <r-sig-mixed-models at r-project.org>
Sent: Wednesday, April 01, 2009 2:44 PM
Subject: [R-sig-ME] Help with lme modelling

>I would like to apologize in advance if this is not the right forum
> for posting this message, since I am using nlme instead of lme4
> package, but it seamed to me that this is the best place for Mixed
> Effects Models (MEM) questions.
>
> I've recently started using MEM for the differential expression
> analysis of genes in cases where I want to analyze compatible datasets
> from different experiments/labs (modelling labs as random effects). I
> was pretty happy with the results since MEM allows for direct use of
> raw expression data, unlike some other meta-analysis approaches and
> thus seem to be more sensitive.
>
> However, I discovered several cases where MEM failed to make correct
> predictions and I would like to find a way to automatically detect
> these in the future (I am doing the analysis on thousands of genes so
> it is not feasible to graphically inspect the quality of a model fit
> for each one separately).
>
> Here is an example:
>
> #data frame
>
>> data
>          expr.val   lab treatment   age
> X1        4.751217     A       sal   old
> X2        4.660610     A       sal   old
> X3        4.810334     A       sal   old
> X4        5.235475     A        KA   old
> X5        5.414665     A        KA   old
> X6        4.451858     A        KA   old
> X7        4.812522     A       sal young
> X8        6.346419     A       sal young
> X9        5.267596     A       sal young
> X10       3.823031     A        KA young
> X11       3.619814     A        KA young
> X12       3.539568     A        KA young
> X13       5.671991     B       sal   old
> X14       5.715370     B       sal   old
> X15       5.694324     B       sal   old
> X16       5.457560     B        KA   old
> X17       5.454112     B        KA   old
> X18       5.430781     B        KA   old
>
> As you can see the data comes from two labs, A and B, there are two
> treatments, KA and sal, and the effect of these is my main interest
> and there are also two age groups.
>
> Fitting linear MEM (I expect some interaction between two fixed
> effects, treatment and age):
>
>> data.lme<- lme(expr.val~treatment*age,data=data,random=~1|lab)
>
>> summary(data.lme)
> Linear mixed-effects model fit by REML
> Data: m
>       AIC      BIC    logLik
>  33.76913 37.60347 -10.88456
>
> Random effects:
> Formula: ~1 | Study
>        (Intercept)  Residual
> StdDev:   0.4553264 0.3960901
>
> Fixed effects: E ~ Treatment * Age
>                          Value Std.Error DF   t-value p-value
> (Intercept)            5.240742 0.3602901 13 14.545894  0.0000
> Treatmentsal          -0.023434 0.2286827 13 -0.102475  0.9199
> Ageyoung              -1.276539 0.3000890 13 -4.253867  0.0009
> Treatmentsal:Ageyoung  1.838143 0.3960901 13  4.640719  0.0005
> Correlation:
>                      (Intr) Trtmnt Ageyng
> Treatmentsal          -0.317
> Ageyoung              -0.242  0.381
> Treatmentsal:Ageyoung  0.183 -0.577 -0.660
>
> Standardized Within-Group Residuals:
>       Min         Q1        Med         Q3        Max
> -1.6738372 -0.3845786 -0.2229425  0.4311380  2.1987599
>
> Number of Observations: 18
> Number of Groups: 2
>
>> anova(data.lme)
>              numDF denDF   F-value p-value
> (Intercept)       1    13 232.76945  <.0001
> Treatment         1    13   9.96020  0.0076
> Age               1    13   2.51385  0.1369
> Treatment:Age     1    13  21.53627  0.0005
>
>
>
>
> What I do when automatically analyzing all the genes is to look at the
> p-value from anova function to assess the significance of Treatment
> effect and at the sign of estimated parameter (Treatmentsal) from the
> summary function to decide on the direction of change (is it more
> expressed after sal or KA treatment). In the above example the results
> indicate that Treatment has significant effect and that there is more
> expression for KA effect (Treatmentsal=-0.023434). However, when I
> plot the raw data and superimpose the fitted values it is obvious that
> the fit is not good for KA-old datapoints and this is probably caused
> by contradictory data wrt to treatment effect between the two labs
> (sal<KA for A and sal>KA for B).
>
> My questions are:
>
> 1. How is the F-value in  anova function computed? I would like to
> understand why I got such a small value when data is contradictory.
>
> 2. The Treatmentsal value seems to correspond only to the age=old
> (change from a baseline treatment=KA, age=old). How do I compute the
> Treatmentsal value for both age groups?
>
> 3. In general, what functions/parameters can I used in automated way
> to decide if there is a significant effect of, for example, treatment
> KA?
>
> Sorry for the lengthy post and superficial understanding of how MEM
> work (not much background in statistics). I would very much appreciate
> any help on these issues.
>
> Cheers,
> Sanja Rogic
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>