[BioC] How to filter genes not expressed in all arrays using limma (single channel agilent microarray data)
Nori [guest]
guest at bioconductor.org
Thu Jun 27 23:51:15 CEST 2013
Hello all,
I have a question about filtering genes from single-channel Agilent microarray data.
Brief description of my data: I started with 2 Agilent datasets, one scanned with a GenePix scanner and the other with an Agilent scanner. They are life-stage data. The arrays scanned with the GenePix scanner are mRNA samples collected at adult stages from males and females of 2 Drosophila species (3 replicates per sample, for a total of 12 adult-stage arrays). The arrays scanned with the Agilent scanner are mRNA samples from males and females of the same 2 Drosophila species at earlier developmental stages, also with 3 replicates each, for a total of 36 arrays for the earlier life stages.
I combined the datasets using cbind, giving 48 arrays in total, which I background corrected and normalized together. I then filtered the probes to remove control probes and lowly expressed probes, using code similar to that in the limma user guide:
neg95 <- apply(y$E[y$genes$ControlType==-1,],2,function(x) quantile(x,p=0.95))
cutoff <- matrix(1.1*neg95,nrow(y),ncol(y),byrow=TRUE)
isexpr <- rowSums(y$E > cutoff) >= 3
table(isexpr)
How do I filter the genes so that I keep only the ones that are expressed in all arrays? The code above doesn't appear to do this: it keeps any probe that is above the cutoff on at least 3 arrays.
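For reference, a minimal sketch of the difference between the two criteria, using a made-up toy matrix in place of the real expression values (the matrix E and the neg95 values are hypothetical, chosen only for illustration). Requiring the count of arrays above the cutoff to equal the number of arrays, rather than to be at least 3, would keep only probes that exceed the cutoff on every array. This is a sketch under those assumptions, not a confirmed recommendation:

```r
# Toy stand-in for norm.combined$E: 5 probes x 4 arrays (hypothetical values)
E <- matrix(c(8, 9, 8, 9,
              3, 9, 9, 9,
              9, 9, 9, 9,
              2, 2, 2, 2,
              7, 8, 3, 8),
            nrow = 5, byrow = TRUE)

# Hypothetical per-array 95th percentile of the negative control probes
neg95 <- c(5, 5, 5, 5)

# Same cutoff construction as in the post: one cutoff per array (column)
cutoff <- matrix(1.1 * neg95, nrow(E), ncol(E), byrow = TRUE)

# Number of arrays on which each probe exceeds its array's cutoff
n.above <- rowSums(E > cutoff)

# Criterion used in the post: above the cutoff on at least 3 arrays
isexpr.atleast3 <- n.above >= 3

# Stricter criterion: above the cutoff on every array
isexpr.all <- n.above == ncol(E)

table(isexpr.atleast3)
table(isexpr.all)
```

In the full code, isexpr.all would take the place of isexpr when subsetting norm.combined. With 48 arrays covering different life stages and sexes this is a strict criterion, so it is worth checking with table() how many probes it retains.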
My full code is here:
> targetAdult<-readTargets("TargetAdult.txt")
> Adult<-read.maimages(targetAdult$FileName,source="genepix",green.only=TRUE)
> targetLarvae<-readTargets("Target_Larvae.txt")
> Larvae<-read.maimages(targetLarvae$FileName,source="agilent",green.only=TRUE)
> combined.all<-cbind(Larvae,Adult)
> bgc.combined<-backgroundCorrect(combined.all,method="normexp")
> norm.combined<-normalizeBetweenArrays(bgc.combined,method="quantile")
> neg95<-apply(norm.combined$E[norm.combined$genes$ControlType==-1,],2,function(x) quantile(x,p=0.95))
> cutoff<-matrix(1.1*neg95,nrow(norm.combined),ncol(norm.combined),byrow=TRUE)
> isexpr<-rowSums(norm.combined$E>cutoff)>=3
> table(isexpr)
isexpr
FALSE  TRUE
 9219 33600
> y0<-norm.combined[norm.combined$genes$ControlType==0&isexpr,]
> y.ave<-avereps(y0,ID=y0$genes[,"ProbeName"])
Thanks in advance for help with this!
-- output of sessionInfo():
R version 3.0.0 (2013-04-03)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] limma_3.16.4
loaded via a namespace (and not attached):
[1] tools_3.0.0
--
Sent via the guest posting facility at bioconductor.org.