[BioC] genefilter vs limma - many probes filtered

Marcin Kaminski [guest] guest at bioconductor.org
Fri May 23 03:41:24 CEST 2014


Dear list,
I've followed the tips on gene filtering at http://www.bioconductor.org/packages/release/bioc/vignettes/genefilter/inst/doc/independent_filtering.pdf when analyzing GEO data (GSE48060). In this case, the number of probes passing the test (adj. p < 0.05) is largest if I filter out roughly 70% of them based on variance, which triples the number of positives compared to not filtering at all (related graphic: http://i.imgur.com/RuuvRIo.png).
Should I be concerned about such extensive filtering? Does it affect further analysis with limma and introduce bias? If it's a problem, what are the available solutions or diagnostics?
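
For reference, the kind of comparison described above can be sketched roughly as follows. This is only a minimal sketch on simulated data, not the actual GSE48060 analysis: the matrix `expr`, the two-group layout, the effect size, and the helper `count_rejections()` are all assumptions made for illustration. It drops a fraction `theta` of the lowest-variance probes, fits limma on the remainder, and counts probes with adj. p < 0.05 (BH).

```r
library(limma)

set.seed(1)

# Simulated stand-in for a probe-by-sample expression matrix
# (hypothetical data, NOT GSE48060).
n_probes    <- 2000
n_per_group <- 6
expr <- matrix(rnorm(n_probes * 2 * n_per_group), nrow = n_probes)
rownames(expr) <- paste0("probe", seq_len(n_probes))

group <- factor(rep(c("control", "case"), each = n_per_group),
                levels = c("control", "case"))

# Give the first 200 probes a true shift in the "case" group.
expr[1:200, group == "case"] <- expr[1:200, group == "case"] + 1.5

design <- model.matrix(~ group)

# Filter statistic: overall per-probe variance, computed without
# using the group labels.
row_var <- apply(expr, 1, var)

# Hypothetical helper: remove the fraction `theta` of probes with the
# lowest variance, run limma on what is left, and count probes with
# BH-adjusted p-value below `alpha`.
count_rejections <- function(theta, alpha = 0.05) {
  keep <- row_var >= quantile(row_var, probs = theta)
  fit  <- eBayes(lmFit(expr[keep, , drop = FALSE], design))
  tab  <- topTable(fit, coef = 2, number = Inf,
                   adjust.method = "BH", sort.by = "none")
  sum(tab$adj.P.Val < alpha)
}

thetas     <- seq(0, 0.9, by = 0.1)
rejections <- sapply(thetas, count_rejections)
kept       <- sapply(thetas, function(th) {
  sum(row_var >= quantile(row_var, probs = th))
})

# One row per filtering fraction: probes kept and probes called positive.
result <- data.frame(theta = thetas, kept = kept, rejections = rejections)
print(result)
```

Plotting `rejections` against `theta` gives a curve of the same kind as the linked graphic. Note that in this sketch both the multiple-testing adjustment and limma's `eBayes()` step are recomputed on the filtered subset only.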

Thanks for your help!

Best regards,
Marcin


 -- output of sessionInfo(): 

R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Polish_Poland.1250  LC_CTYPE=Polish_Poland.1250    LC_MONETARY=Polish_Poland.1250 LC_NUMERIC=C                  
[5] LC_TIME=Polish_Poland.1250    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] RColorBrewer_1.0-5    hgu133plus2.db_2.14.0 org.Hs.eg.db_2.14.0   RSQLite_0.11.4        DBI_0.2-7             AnnotationDbi_1.26.0 
 [7] GenomeInfoDb_1.0.2    genefilter_1.46.1     matrixStats_0.8.14    limma_3.20.3          GEOquery_2.30.0       Biobase_2.24.0       
[13] BiocGenerics_0.10.0  

loaded via a namespace (and not attached):
 [1] annotate_1.42.0   IRanges_1.22.6    R.methodsS3_1.6.1 RCurl_1.95-4.1    splines_3.1.0     stats4_3.1.0      survival_2.37-7   tools_3.1.0      
 [9] XML_3.98-1.1      xtable_1.7-3     


--
Sent via the guest posting facility at bioconductor.org.


