[BioC] Anova; can I use lmFit and eBayes from Limma, or should it be aov?
Gordon K Smyth
smyth at wehi.EDU.AU
Tue Nov 20 03:08:57 CET 2012
Dear John,
What limma does is always equivalent to Anova. Of course it is not
classical Anova, but rather an empirical Bayes extension of Anova that is
appropriate for microarray data.
Post hoc tests are done in limma using decideTests(), and many options are
offered. You won't find classical methods like TukeyHSD though, because
limma isn't doing classical Anova and because methods like TukeyHSD don't
generalize well to high-dimensional datasets like microarrays.
There are a couple of problems with your code below.
Firstly, you have overlooked the fact that limma, unlike classical
Anova, does not automatically adjust for the intercept term. (This is for
historical reasons because limma was originally developed for two colour
microarrays.) Instead limma expects you to specify which coefficients or
contrasts you want to test for. When you run
topTable(fit2)
without specifying any coefficients, limma will test whether any of the
coefficients are different from zero, including the intercept term.
You can see that it has done this because it has included a column for
the intercept in the toptable. This is not what you want.
Secondly, you have failed to specify in your model that patient should be
a factor in your model rather than a numerical covariate. It is also
unclear to me whether time should be a factor or a covariate. At the
moment both are being treated as a covariates.
It is a good idea to look at colnames(design) or colnames(fit) to check
that the terms in the model are what you intended.
Once you fix your model, you can conductor Anova-style F-tests using limma
simply by specifying a range of coefficients to topTable(), as discussed
in the User's Guide.
Best wishes
Gordon
> Date: Mon, 19 Nov 2012 10:43:35 +0000
> From: john herbert <arraystruggles at gmail.com>
> To: Bioconductor mailing list <bioconductor at r-project.org>,
> bioconductor at stat.math.ethz.ch
> Subject: [BioC] Anova; can I use lmFit and eBayes from Limma, or
> should it be aov?
>
> Dear Bioconductors,
> This may be more a question of statistics understanding but it is
> related to Limma.
>
> Is it possible to perform an Anova using functions from the Limma
> package? If so, how can a post hoc test be applied to the eBayes
> "MArrayLM" resulting object?
>
> For me, it is a complex design I have array data, different clients,
> time course, stage of disease and treatment. The Limma manual eludes
> to Limma being equivalent to Anova in some cases for single channel
> arrays (on pages 35 and 41). I am using single channel data.
>
> This code runs;
>
> dd <- data.frame(patient=targets$patient,stage=as.factor(targets$stage),treat=as.factor(targets$treat),time=targets$time)
>
> head(dd)
>
> patient stage treat time
> 1 1 A Y 24
> 2 1 A Y 72
> 3 2 B N 24
> 4 2 B N 72
> 5 2 B N 0
> 6 2 B Y 72
>
> design <- model.matrix(~patient+stage*treat*time,data=dd)
>
> fit <- lmFit(temp,design)
> fit2 <- eBayes(fit)
>
> topTable(fit2)
> ID X.Intercept. patient stageB
> stageC treatY time stageB.treatY stageC.treatY
> stageB.time stageC.time
> A_23_P113716 A_23_P113716 17.57842 -0.0042698998 -0.18798801
> -0.18726778 -0.008996453 1.282717e-03 0.38532184 0.18501711
> 0.0025256233 0.0005692329
> A_23_P77779 A_23_P77779 17.07476 0.0015734113 -0.14199010
> -0.07687905 -0.132257119 8.288697e-05 0.04068160 0.03375537
> 0.0024200711 0.0014022044
>
> treatY.time stageB.treatY.time stageC.treatY.time
> AveExpr F P.Value adj.P.Val
> A_23_P113716 -1.340200e-03 -0.0072744604 -0.0022265946
> 17.49773 144347.32 6.481891e-159 2.657575e-154
> A_23_P77779 -8.857291e-04 0.0008166484 -0.0004867999
> 17.01806 98921.73 8.930817e-153 1.830817e-148
>
>
> How do I interpret the results, if I am performing Anova in Limma
> correctly? What are the numbers for X.intercept and stageB etc?
>
> If I take all probes into a variable with
> top_data <- topTable(fit2,n=200000000)
> and look for ones that are not significant, I don't find any.
> which(top_data$adj.P.Val > 0.01)
> integer(0)
>
> So it looks like every single probe varies significantly between any
> of the conditions, which my instincts say is not correct.
>
> Maybe I should run aov like this instead? But there is still a problem
> of post hoc testing as
>
> aof2 <- function(x,y){
> aov(x ~ stage*treat*time+Error(patient), y)
> }
> anovaresults <- apply(temp, 1, aof2,dd)
>
> lapply(anovaresults,TukeyHSD)
> Error in UseMethod("TukeyHSD") :
> no applicable method for 'TukeyHSD' applied to an object of class
> "c('aovlist', 'listof')"
>
> If anyone knows how to solve either of these issues, it will be appreciated.
>
> Thanks,
>
> John.
>
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
More information about the Bioconductor
mailing list