[BioC] Cutoff to use for IQR filtering in genefilter
Mark Cowley
m.cowley0 at gmail.com
Mon Jun 23 01:31:51 CEST 2008
Hi Seungwoo,
The range/IQR/SE/SD of your data depends on a number of factors,
including biological variability and various sources of technical
variability, such as the choice of normalisation algorithm
(think RMA vs MAS5).
Basically, an IQR cutoff of 0.1 might remove half the genes in my
study, whereas in your study it might remove only 10% of them.
Suggestions such as Robert's are useful because they use the IQR of
YOUR data in order to set that cutoff.
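To make the point concrete, here is a minimal sketch on simulated data
(the two "studies", their sizes and their standard deviations are made
up purely for illustration). It compares the fraction of genes removed
by a fixed IQR cutoff of 0.1 with the fraction removed by a data-driven
cutoff, reading Robert's suggestion as "the median of the per-gene
IQRs":

```r
# Simulated data only: two hypothetical studies with different spread.
set.seed(1)
n_genes <- 1000
n_samples <- 10
study_a <- matrix(rnorm(n_genes * n_samples, mean = 8, sd = 0.1),
                  nrow = n_genes)
study_b <- matrix(rnorm(n_genes * n_samples, mean = 8, sd = 0.5),
                  nrow = n_genes)

# Per-gene IQR across samples (rows = genes, columns = samples).
iqr_a <- apply(study_a, 1, IQR)
iqr_b <- apply(study_b, 1, IQR)

# Fraction of genes removed by a fixed cutoff of 0.1.
frac_fixed_a <- mean(iqr_a < 0.1)
frac_fixed_b <- mean(iqr_b < 0.1)

# Fraction removed by a data-driven cutoff: the median of the
# per-gene IQRs (not the median expression value).
frac_median_a <- mean(iqr_a < median(iqr_a))
frac_median_b <- mean(iqr_b < median(iqr_b))

print(c(fixed_a = frac_fixed_a, fixed_b = frac_fixed_b,
        median_a = frac_median_a, median_b = frac_median_b))
```

The fixed cutoff removes very different fractions of genes in the two
simulated studies, while the median-of-IQRs cutoff removes half of the
genes in each.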
I suggest calculating the IQRs for all of your genes, and then either
plotting them with plot(density(IQRs)) or just trying summary(IQRs),
either of which will give you a good feel for how variable your data are.
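As a rough sketch of those steps, assuming your expression values are
in a numeric matrix with genes in rows and samples in columns (the
matrix 'expr' below is simulated and only stands in for your own
data):

```r
# Assumption: 'expr' stands in for a log2-scale expression matrix
# (rows = genes, columns = samples); it is simulated here.
set.seed(42)
expr <- matrix(rnorm(500 * 12, mean = 8, sd = 0.4), nrow = 500,
               dimnames = list(paste0("gene", 1:500),
                               paste0("sample", 1:12)))

IQRs <- apply(expr, 1, IQR)   # one IQR per gene
vars <- apply(expr, 1, var)   # one variance per gene

# Numeric overview of how variable the genes are.
print(summary(IQRs))
print(quantile(IQRs, probs = c(0.1, 0.25, 0.5, 0.75)))

# Visual overview of the same distribution.
plot(density(IQRs), main = "Per-gene IQR")
```

If your data are in an ExpressionSet, the matrix would typically come
from exprs() on that object.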
If you need help calculating the IQRs and/or variances of your genes,
please post back to the list.
cheers,
Mark
On 22/06/2008, at 9:05 PM, Seungwoo Hwang wrote:
> I am wondering what cutoff value I should use for IQR filtering in
> genefilter. I did some literature search. It varies from paper to
> paper. I have read two papers so far. One used 0.5, the other used
> 0.18. affylmGUI has an option of 0.5, 0.25, and 0.1.
>
> I also searched the Bioconductor archive and read that Dr. Robert
> Gentleman suggested filtering out the genes whose IQR is below the
> median, rather than below some fixed value.
>
> I have two questions in this vein.
>
> (1) How small is a gene's variance (in terms of number) if its IQR
> is some value, say, 0.5 or 0.1? Can I calculate it?
> (2) When median is used instead of fixed number, wouldn't it be too
> large, since median of a gene's expression intensities across
> samples can be anything?
>
> Thanks,
>
> Seungwoo
> ------------------------------------
> Seungwoo Hwang, Ph.D.
> Senior Research Scientist
> Korean Bioinformation Center
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
----------------------------------------------------------------------
Mark Cowley, BSc (Bioinformatics)(Hons)
Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research