[BioC] Agi4x44Preprocess/Limma and number of significant p-values
Paulo Nuin
nuin at genedrift.org
Mon Oct 31 18:45:54 CET 2011
Hi everyone
I have used Agi4x44Preprocess (script below) to analyze three mouse (mgug4122a) Agilent microarray sets and I want to check if the methods I'm using make sense, as the number of significant p-values I'm getting seems quite high. I'm using Limma for the significance analysis. the lowest value I'm getting is 1.09667003222346e-12, while the adjusted p-value 2.43208513046197e-08. If I consider significant p-value below 0.01, I get 9721 significant ones, roughly half of the gene list. This was for one of the datasets, with 2 (four result sets) micorarrays, 1 for each treatment. For another similar dataset, 4 microarrays with 2 samples for each treatment, results are very similar.
I also tried the approach in this page http://matticklab.com/index.php?title=Single_channel_analysis_of_Agilent_microarray_data_with_Limma and the results seem better (I mean less p-values below 0.01).
I was wondering what would be the suggested approach and if there's any reason that I'm getting so many values below 0.01, with some extreme values at 10-12 range. Is there a bug on Agi4x44Preprocess? It seism to get the same columns as the approach on the MAttick's lab webpage.
Thanks in advance for any help
Paulo
library(Agi4x44PreProcess)
library(mgug4122a.db)
targets=read.targets(infile="targets.txt")
dd=read.AgilentFE(targets, makePLOT=FALSE)
CV.rep.probes(dd, "mgug4122a.db", foreground = "MeanSignal", raw.data = TRUE, writeR = T, targets)
genes.rpt.agi(dd, "mgug4122a.db", raw.data = TRUE, WRITE.html = T, REPORT = T)
ddNORM = BGandNorm(dd, BGmethod = "half", NORMmethod = "quantile", foreground = "MeanSignal", background = "BGMedianSignal", offset = 50, makePLOTpre = FALSE, makePLOTpost = F)
ddFILT = filter.probes(ddNORM, control = TRUE, wellaboveBG = TRUE,
isfound = TRUE, wellaboveNEG = TRUE, sat = TRUE, PopnOL = TRUE,
NonUnifOL = T, nas = TRUE, limWellAbove = 75, limISF = 75,
limNEG = 75, limSAT = 75, limPopnOL = 75, limNonUnifOL = 75,
limNAS = 100, makePLOT = F, annotation.package = "mgug4122a.db",
flag.counts = T, targets)
ddPROC = summarize.probe(ddFILT, makePLOT = F, targets)
esetPROC = build.eset(ddPROC, targets, makePLOT = F, annotation.package = "mgug4122a.db")
write.eset(esetPROC,ddPROC,"mgug4122a.db",targets)
levels.treatment = levels(factor(targets$Treatment))
treatment = factor(targets$Treatment, levels = levels.treatment)
design = model.matrix(~0 + treatment)
print(design)
colnames(design) = c("EV50", "SHGR19")
fit = lmFit(esetPROC, design)
CM = makeContrasts(EV50vsSHGR19=EV50-SHGR19, levels=design )
fit2 = contrasts.fit(fit, CM)
fit2 = eBayes(fit2)
my.lfc <- 0
output <- topTable(fit2, coef=1, adjust.method="BH", lfc=my.lfc, number = 20000)
write.table(output, file="output.txt", sep="\t", quote=FALSE)
More information about the Bioconductor
mailing list