[BioC] Breaking the "most genes not differentially expressed" assumption
Paolo Innocenti
paolo.innocenti at ebc.uu.se
Mon Apr 27 18:22:34 CEST 2009
Thanks for the quick reply,
here is my code, run in a fresh session:
require(affy)
miame <- read.MIAME("../stat_input_rawdata/hemiarray.miame")
phenodata <- read.AnnotatedDataFrame(
"../stat_input_rawdata/hemiarray.phenodata",sep=" ")
data <- ReadAffy(
filenames=phenodata$filenames,
sampleNames=sampleNames(phenodata),
phenoData=phenodata,
description=miame,
celfile.path="../stat_input_rawdata")
eset <- rma(data)
require(limma)
f.sex <- factor(pData(data)$sex)
design <- model.matrix(~0 + f.sex)
colnames(design) <- levels(f.sex)
fit <- lmFit(eset,design)
contrast <- makeContrasts(M-F, levels=design)
fit2 <- contrasts.fit(fit,contrast)
fit2 <- eBayes(fit2)
pval <- 0.0001
adjusted <- "fdr"
res <- decideTests(fit2,
adjust=adjusted,
method="separate",
p.value=pval)
My results:
> summary(res)
M - F
-1 11653
0 1442
1 5857
And my sessioInfo():
> sessionInfo()
R version 2.8.0 (2008-10-20)
x86_64-unknown-linux-gnu
locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
attached base packages:
[1] tools stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] affy_1.20.0 Biobase_2.2.1 limma_2.16.3
loaded via a namespace (and not attached):
[1] affyio_1.10.1 preprocessCore_1.4.0
----
Best,
paolo
Sean Davis wrote:
>
>
> On Mon, Apr 27, 2009 at 11:25 AM, Paolo Innocenti
> <paolo.innocenti at ebc.uu.se <mailto:paolo.innocenti at ebc.uu.se>> wrote:
>
> Hi all,
>
> I have dataset of 120 Affy arrays, 60 males and 60 females.
> The expression profiles of the 2 groups differs dramatically, i.e.
> if I run a standard RMA + limma, I have ~90% of the genes
> differentially expressed.
>
>
> You'll probably want to provide the code that you used to perform the
> RMA and limma analyses. It is difficult to comment on the results of an
> analysis without seeing what was done to produce them. Also, be sure to
> provide sessionInfo() in case there are questions about what packages
> are used.
>
> Sean
>
>
>
> Also, downregulated genes are twice as many than upregulated genes,
> although if I impose a cutoff of two-fold difference in expression,
> they are almost equal (15% up and 15% down).
> This is clearly breaking the assumption that most of the genes on
> the array should not be differentially expressed, but the result is
> in line with the current knowledge of sex-biased gene expression in
> my model organism.
>
> I have done some quality control plots, available here:
> - Boxplot:
> http://www.iee.uu.se/zooekol/pdf/hemiarray_qc_boxplot.pdf
>
> - Frequency histogram:
> http://www.iee.uu.se/zooekol/pdf/hemiarray_qc_histogram.pdf
>
> - RLE and NUSE plots:
> http://www.iee.uu.se/zooekol/pdf/hemiarray_qc_RLEandNUSE1.pdf
>
> - CorrelationPlot:
> http://www.iee.uu.se/zooekol/pdf/hemiarray_qc_correlationplot.pdf
>
> - PCA, after RMA normalization:
> http://www.iee.uu.se/zooekol/pdf/hemiarray_qc_pca.pdf
>
> Now, my questions are:
> 1) Is my issue really a issue? If so, how can I perform a robust
> normalization of my arrays?
>
> 2) Is there a tool to assess how "robust" your pre-processing method
> is in respect to this issue?
>
> 3) Sex-biased gene expression is not the only biological question in
> my experiment. Is the massive size of this effect going to affect
> the "detectability" of other smaller effects? (through normalization
> or correction for multiple testing or other?)
>
> Thanks,
> paolo
>
>
> --
> Paolo Innocenti
> Department of Animal Ecology, EBC
> Uppsala University
> Norbyvägen 18D
> 75236 Uppsala, Sweden
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch <mailto:Bioconductor at stat.math.ethz.ch>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
--
Paolo Innocenti
Department of Animal Ecology, EBC
Uppsala University
Norbyvägen 18D
75236 Uppsala, Sweden
More information about the Bioconductor
mailing list