[BioC] Non-specific filtering of Affymetrix Microarray data
Vinay Randhawa [guest]
guest at bioconductor.org
Tue Feb 18 05:07:28 CET 2014
During non-specific filtering, I am using parameters for filtering probes (require.entrez=TRUE, remove.dupEntrez=TRUE, feature.exclude="^AFFX") in addition to the intensity and variance filters. Independently, both filters work fine, but when I try to use them together, I get the error below:
Error in apply(expr, 1, flist) : dim(X) must have a positive length
Please help me with this.
I have pasted the code below.
#1.Getting the data
source("http://bioconductor.org/biocLite.R")
biocLite("GEOquery")
biocLite("affycoretools")
library(GEOquery)
setwd("/home/vinay/R/R-3.0.2")
getGEOSuppFiles("GSE6631")
setwd("/home/vinay/R/R-3.0.2/GSE6631")
system("tar -xvf GSE6631_RAW.tar")
cels <- list.files(pattern = "\\.gz$") # match only the gzipped files; "[gz]" matches any name containing g or z
sapply(cels, gunzip)
#2.Loading and normalising the data using GC-RMA
# You may need to copy your phenodata.txt file into the GSE6631 folder
library(affy)
library(affycoretools)
data <- ReadAffy()
pData(data)<-read.table("phenodata.txt", header=T,row.names=1, sep="\t")
pData(data)
eset <- gcrma(data)
eset
dim(eset)
pData(eset)
write.exprs(eset, file="Expression_values_GCRMA_normalize.xls")
eset2<-eset[,pData(eset)[,"Condition"]%in%c("Normal","Cancer")]
#3. Non-specific Filtering data
library(genefilter)
celfiles_filtered <- nsFilter(eset2, require.entrez=TRUE, remove.dupEntrez=TRUE,feature.exclude="^AFFX")
f1<-pOverA(0.10,log2(100)) #intensity filter: the intensity of a gene should be above log2(100) in at least 10 percent of the samples
f2<-function(x)(IQR(x)>0.5) #variance filter: the interquartile range of log2 intensities should exceed 0.5
ff<-filterfun(f1,f2)
selected<-genefilter(celfiles_filtered,ff)
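A likely cause of the error, offered as a sketch rather than a confirmed diagnosis: nsFilter() returns a list (with elements eset and filter.log), not an ExpressionSet, so genefilter() ends up calling apply() on an object that has no dimensions. Passing celfiles_filtered$eset instead of celfiles_filtered should avoid this. The base-R example below reproduces the same message with a stand-in list; the names filtered and iqr_filter and the random data are hypothetical and only there so the example runs without Bioconductor.

```r
# Stand-in for the list that nsFilter() returns: the filtered data sit in
# the `eset` element, next to a `filter.log` element. Here `eset` is a plain
# matrix of made-up values (an assumption for illustration only).
set.seed(1)
filtered <- list(
  eset = matrix(rnorm(40, mean = 7), nrow = 10,
                dimnames = list(paste0("probe", 1:10), paste0("s", 1:4))),
  filter.log = list(numRemoved.ENTREZID = 0)
)

# Row-wise filter, analogous to f2 in the post (IQR above a cutoff).
iqr_filter <- function(x) IQR(x) > 0.5

# Passing the whole list fails because a list has no dim attribute.
msg <- tryCatch(apply(filtered, 1, iqr_filter),
                error = function(e) conditionMessage(e))
print(msg)

# Passing the extracted element works and gives one logical value per row.
selected <- apply(filtered$eset, 1, iqr_filter)
print(length(selected))
```

Note also that nsFilter() applies an IQR-based variance filter by default (var.filter=TRUE), so the two steps may overlap unless that option is switched off.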
-- output of sessionInfo():
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_IN LC_NUMERIC=C LC_TIME=en_IN
[4] LC_COLLATE=en_IN LC_MONETARY=en_IN LC_MESSAGES=en_IN
[7] LC_PAPER=en_IN LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_IN LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] hgu95av2.db_2.10.1 org.Hs.eg.db_2.10.1
[3] arrayQualityMetrics_3.18.0 affyPLM_1.38.0
[5] preprocessCore_1.24.0 RColorBrewer_1.0-5
[7] hgu95av2probe_2.13.0 affycoretools_1.34.0
[9] KEGG.db_2.10.1 GO.db_2.10.1
[11] RSQLite_0.11.4 DBI_0.2-7
[13] limma_3.18.12 hgu95av2cdf_2.13.0
[15] AnnotationDbi_1.24.0 simpleaffy_2.38.0
[17] genefilter_1.44.0 gcrma_2.34.0
[19] affy_1.40.0 GEOquery_2.28.0
[21] Biobase_2.22.0 BiocGenerics_0.8.0
[23] BiocInstaller_1.12.0
loaded via a namespace (and not attached):
[1] affyio_1.30.0 annaffy_1.34.0 annotate_1.40.0
[4] AnnotationForge_1.4.4 beadarray_2.12.0 BeadDataPackR_1.14.0
[7] biomaRt_2.18.0 Biostrings_2.30.1 biovizBase_1.10.7
[10] bit_1.1-11 bitops_1.0-6 BSgenome_1.30.0
[13] Cairo_1.5-5 Category_2.28.0 caTools_1.16
[16] cluster_1.14.4 codetools_0.2-8 colorspace_1.2-4
[19] DESeq2_1.2.10 dichromat_2.0-0 digest_0.6.4
[22] edgeR_3.4.2 ff_2.2-12 foreach_1.4.1
[25] Formula_1.1-1 gdata_2.13.2 GenomicFeatures_1.14.2
[28] GenomicRanges_1.14.4 ggbio_1.10.11 ggplot2_0.9.3.1
[31] GOstats_2.28.0 gplots_2.12.1 graph_1.40.1
[34] grid_3.0.2 gridExtra_0.9.1 GSEABase_1.24.0
[37] gtable_0.1.2 gtools_3.3.0 Hmisc_3.14-0
[40] hwriter_1.3 IRanges_1.20.6 iterators_1.0.6
[43] KernSmooth_2.23-10 labeling_0.2 lattice_0.20-24
[46] latticeExtra_0.6-26 locfit_1.5-9.1 MASS_7.3-29
[49] Matrix_1.1-2 munsell_0.4.2 oligoClasses_1.24.0
[52] PFAM.db_2.10.1 plyr_1.8 proto_0.3-10
[55] R2HTML_2.2.1 RBGL_1.38.0 Rcpp_0.11.0
[58] RcppArmadillo_0.4.000.2 RCurl_1.95-4.1 ReportingTools_2.2.0
[61] reshape2_1.2.2 R.methodsS3_1.6.1 R.oo_1.17.0
[64] Rsamtools_1.14.3 rtracklayer_1.22.3 R.utils_1.29.8
[67] scales_0.2.3 setRNG_2011.11-2 splines_3.0.2
[70] stats4_3.0.2 stringr_0.6.2 survival_2.37-7
[73] SVGAnnotation_0.93-1 tcltk_3.0.2 tools_3.0.2
[76] VariantAnnotation_1.8.12 vsn_3.30.0 XML_3.98-1.1
[79] xtable_1.7-1 XVector_0.2.0 zlibbioc_1.8.0
--
Sent via the guest posting facility at bioconductor.org.