[BioC] genefilter vs limma - many probes filtered
Marcin Kaminski [guest]
guest at bioconductor.org
Fri May 23 03:41:24 CEST 2014
Dear list,
I've followed the tips on gene filtering at http://www.bioconductor.org/packages/release/bioc/vignettes/genefilter/inst/doc/independent_filtering.pdf when analyzing GEO data (GSE48060). In this case, the number of probes passing the test (adj. p < 0.05) is largest when I filter out roughly 70% of them based on variance; this triples the number of positives compared to not filtering at all. (related graphic: http://i.imgur.com/RuuvRIo.png)
Should I be concerned about such extensive filtering? Does it affect further analysis with limma and introduce bias? If it's a problem, what are the available solutions or diagnostics?
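To make the procedure in question concrete, here is a minimal sketch of variance filtering followed by Benjamini-Hochberg adjustment. It is written in Python rather than R and runs on simulated data, not on GSE48060, so the counts it prints say nothing about the real data set. All function names (`bh_adjust`, `approx_two_sample_p`, `simulate`, `count_hits`) and all simulation settings are assumptions made for illustration. The p-values come from a normal approximation to a Welch-type statistic, not from limma's moderated t-test.

```python
import random
import statistics
from statistics import NormalDist


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def approx_two_sample_p(x, y):
    """Two-sided p-value for a Welch-type statistic (normal approximation)."""
    se = (statistics.variance(x) / len(x) + statistics.variance(y) / len(y)) ** 0.5
    z = (statistics.mean(x) - statistics.mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate(n_probes=2000, n_per_group=30, frac_changed=0.1, shift=1.0, seed=1):
    """Hypothetical data: the first frac_changed of the probes differ between groups."""
    rng = random.Random(seed)
    data = []
    for g in range(n_probes):
        delta = shift if g < n_probes * frac_changed else 0.0
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(delta, 1.0) for _ in range(n_per_group)]
        data.append((a, b))
    return data


def count_hits(data, filter_fraction, alpha=0.05):
    """Drop the lowest-variance fraction of probes, then BH-adjust the rest."""
    # Filter statistic: overall variance, computed without using group labels.
    overall_var = [statistics.variance(a + b) for a, b in data]
    n_drop = round(len(data) * filter_fraction)
    keep = sorted(range(len(data)), key=lambda i: overall_var[i])[n_drop:]
    pvals = [approx_two_sample_p(*data[i]) for i in keep]
    n_hits = sum(p < alpha for p in bh_adjust(pvals))
    return len(keep), n_hits


data = simulate()
for frac in (0.0, 0.3, 0.5, 0.7):
    kept, hits = count_hits(data, frac)
    print(f"filtered out {frac:.0%}: kept {kept} probes, {hits} with adj. p < 0.05")
```

The point of the sketch is that the filter statistic (overall variance) ignores the group labels, and that the multiple-testing adjustment is applied only to the probes that survive the filter. The vignette linked above covers the R equivalents.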
Thanks for your help!
Best regards,
Marcin
-- output of sessionInfo():
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=Polish_Poland.1250 LC_CTYPE=Polish_Poland.1250 LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C
[5] LC_TIME=Polish_Poland.1250
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] RColorBrewer_1.0-5 hgu133plus2.db_2.14.0 org.Hs.eg.db_2.14.0 RSQLite_0.11.4 DBI_0.2-7 AnnotationDbi_1.26.0
[7] GenomeInfoDb_1.0.2 genefilter_1.46.1 matrixStats_0.8.14 limma_3.20.3 GEOquery_2.30.0 Biobase_2.24.0
[13] BiocGenerics_0.10.0
loaded via a namespace (and not attached):
[1] annotate_1.42.0 IRanges_1.22.6 R.methodsS3_1.6.1 RCurl_1.95-4.1 splines_3.1.0 stats4_3.1.0 survival_2.37-7 tools_3.1.0
[9] XML_3.98-1.1 xtable_1.7-3
--
Sent via the guest posting facility at bioconductor.org.