[BioC] PreFiltering probe in microarray analysis
Yuan Hao
yuan.x.hao at gmail.com
Wed Jun 1 15:31:48 CEST 2011
Hi Stephanie,
You can have a look at the 'genefilter' package in R/Bioconductor.
Basically, it's easy to set up an overall variance filter. For example,
if you have a data set normalized by gcrma and you require all
probesets to have an IQR bigger than 0.5, you can do:
> library(affy)
> library(genefilter)
> library(gcrma)
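> # 'data' is assumed to be an AffyBatch, e.g. as returned by ReadAffy()
> eset <- gcrma(data)
> # filter function: TRUE for probesets whose IQR exceeds 0.5
> f <- function(x) IQR(x) > 0.5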
> selected <- genefilter (eset, f)
> eset.filtered <- eset[selected, ]
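If you want to try the row-wise logic without installing anything, here is a minimal base R sketch of the same IQR criterion on simulated data. The matrix 'expr' and its values are made up for illustration, and apply() stands in for genefilter here, so treat it as a sketch of the idea rather than of the package itself.

```r
# Simulated log2 expression matrix: 200 probesets x 8 arrays (made-up data).
set.seed(1)
expr <- matrix(rnorm(200 * 8, mean = 8, sd = 0.2), nrow = 200,
               dimnames = list(paste0("probe", 1:200), paste0("array", 1:8)))
# Give the first 20 probesets extra spread so that some can pass the filter.
expr[1:20, ] <- expr[1:20, ] + rnorm(20 * 8, sd = 1.5)

# Same criterion as above: keep rows whose IQR exceeds 0.5.
f <- function(x) IQR(x) > 0.5
selected <- apply(expr, 1, f)                    # one logical value per probeset
expr.filtered <- expr[selected, , drop = FALSE]  # keep only the selected rows

table(selected)
```

The filter never looks at which array belongs to which condition, which is what makes it a nonspecific (overall) filter.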
You may have to be careful about filtering your data. How well it works
depends a lot on the characteristics of your data. There is a paper [1]
with a very good review of this, which doesn't really recommend an
overall variance filter combined with limma.
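To see how a filter interacts with the multiple testing correction, the following base R sketch compares Benjamini-Hochberg adjustment with and without an overall variance filter. Everything in it is an assumption for illustration: the data are simulated, the 4 vs 4 design is hypothetical, and an ordinary per-gene t-test is swapped in for limma's moderated t, so it says nothing about the limma case that the paper cautions about.

```r
set.seed(42)
n.genes <- 1000
group <- factor(rep(c("C1", "C2"), each = 4))   # hypothetical 4 vs 4 design

# Made-up log2 data; the first 50 genes get a true shift in C2.
expr <- matrix(rnorm(n.genes * 8, mean = 8, sd = 0.3), nrow = n.genes)
expr[1:50, group == "C2"] <- expr[1:50, group == "C2"] + 1

# Ordinary two-sample t-test per gene (a stand-in, not limma's moderated t).
row.p <- function(m) {
  apply(m, 1, function(x)
    t.test(x[group == "C1"], x[group == "C2"], var.equal = TRUE)$p.value)
}

# No filtering: adjust over all genes.
padj.all <- p.adjust(row.p(expr), method = "BH")

# Overall variance filter, computed without using the group labels.
keep <- apply(expr, 1, IQR) > 0.5
padj.filt <- p.adjust(row.p(expr[keep, , drop = FALSE]), method = "BH")

# Number of genes tested and number below 0.05 in each case.
c(tested.all = n.genes, tested.filtered = sum(keep),
  sig.all = sum(padj.all < 0.05), sig.filtered = sum(padj.filt < 0.05))
```

The adjustment is applied only to the genes that survive the filter, so the correction is over fewer tests. Whether that helps depends on how many true positives the filter throws away.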
Cheers,
Yuan
[1] R. Bourgon, R. Gentleman and W. Huber. PNAS 2010, 107(21):9546-9551.
On 1 Jun 2011, at 13:58, Stephanie PIERSON wrote:
> Hello everybody,
>
> I am a French student in bioinformatics. I have to analyze microarray
> data and I have some questions about prefiltering genes.
> The dataset that I have to analyze consists of 8 microarrays: I have 4
> time points and 2 replicates for each time point. Agilent's
> two-color microarrays (Whole Mouse Genome (4x44K) Oligo Microarrays)
> were used for the analysis. We are searching for genes that are
> differentially expressed between two conditions (for example C1 and
> C2) at the different time points and genes that are differentially
> expressed in one condition (C1 or C2) over time.
> I have chosen LIMMA to perform the statistical analysis because I
> read in papers (Jeanmougin et al., PLoS ONE; Jeffery et al., BMC
> Bioinformatics 2006, 7:359) that it works better in experiments with
> few replicates per condition.
> I performed the statistical analysis on the whole data set (more than
> 37,000 genes), but I get high adjusted p-values after multiple
> testing correction (Benjamini-Hochberg). I would like to prefilter
> genes before the statistical analysis, but I don't know how to do this.
> I read in Bourgon's paper that we can filter on the overall variance
> or on the overall mean, but in my case, with few replicates, how can
> I do this? Moreover, in this paper, it is not recommended to combine
> limma with a filtering procedure ...
> Can someone help me, please?
>
> Thank you,
> Best wishes
> Stéphanie
>
>
>
> --
> Stéphanie PIERSON
> Universite de la Mediterranee (Aix-Marseille II)
> Master 2 Pro Bioinformatique et Génomique
>