[BioC] LIMMA: testing for batch effects
Gordon Smyth
smyth at wehi.edu.au
Thu Apr 13 01:00:17 CEST 2006
>Date: Tue, 11 Apr 2006 15:08:55 +0000
>From: Adaikalavan Ramasamy <ramasamy at cancer.org.uk>
>Subject: [BioC] LIMMA: testing for batch effects
>To: BioConductor mailing list <bioconductor at stat.math.ethz.ch>
>
>Dear all,
>
>We have 63 arrays that are either diseased or normal. We wish to select
>genes adjusting for two covariates : gender (male/female) and
>experimental batch (one/two/three).
>
> >From biological knowledge, we expect the batch effect to be significant
>but we wish to quantify it numerically. Here is a way that we tried and
>the problems we faced. We searched the archives without much success.
>
>
> library(limma) # version 2.3.3
> dd <- model.matrix( ~ disease + gender + batch, cl )
> head( dd )
> (Intercept) diseasenormal gendermale batchtwo batchthree
> 2405 1 1 0 0 0
> 2408 1 1 0 0 0
> 2410 1 1 0 0 0
> GER15 1 0 0 0 0
> GER20 1 0 0 0 1
> GER22 1 0 0 0 0
>
> fit <- lmFit(mds.rma, dd)
>
>
>1) Is the following the correct way of setting up contrasts to find the
>three pairwise comparison between batches ?
>
> contr.m <- cbind( Batch2minus1=c(0,0,0,1,0), Batch2minus1=c(0,0,0,0,1),
> Batch3minus2=c(0,0,0,-1,1) )
>
>
>2) From reading the LIMMA user guide, we think decideTests() could be
>potentially useful.
>
> fit2 <- eBayes( contrasts.fit( fit, contr.m ) )
> a <- decideTests( fit2, method="global" )
> summary(a)
> Batch2minus1 Batch2minus1 Batch3minus2
> -1 15151 13838 4919
> 0 26574 30561 46703
> 1 12950 10276 3053
>
>Can we say that somewhere between 8000 - 27000 genes are affected by at
>least one batch. Or is there a nicer/proper way of explaining this to a
>biologist.
>
>
>3) Ideally we would like to use method="nestedF" as suggested but we get
>the following error message. Can anyone explain what might possibly be
>going wrong.
>
> b <- decideTests( fit2, method="nestedF" )
> Error in if (crossprod(crossprod(Q, x)) > qF[i]) { :
> missing value where TRUE/FALSE needed
This was a bug that persisted for a short time after the built-in R
function p.adjust() changed from "fdr" to "BH" in R 2.2.0. The
change-log entry for limma 2.3.6 reads:
6 November 2005: limma 2.3.6
- decideTests() failed with method="nestedF",adjust.method="BH", now fixed
Regards
Gordon
>Any hints will be much appreciated.
>
>Regards,
>
>--
>Adaikalavan Ramasamy ramasamy at cancer.org.uk
>Centre for Statistics in Medicine http://www.csm-oxford.org/
>Wolfson College Annexe http://www.stats.ox.ac.uk/~ramasamy/
>Linton Road Tel : 01865 284 408
>Oxford OX2 6UD Fax : 01865 284 424
More information about the Bioconductor
mailing list