[BioC] filtering Illumina data

Lana Schaffer schaffer at scripps.edu
Wed Aug 20 22:26:15 CEST 2008


Sean,
I don't think you understood my filtering procedure.
I filter out only those probes that have an undetected call on
every array.
My real question is how reliable the detection p-value from
Illumina is, and whether my chosen cutoff of 0.05 is reasonable.
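
The rule described above can be sketched as follows. This is only an
illustration, written in Python so that it is self-contained: the probe
IDs and p-values are invented, and `keep_if_detected_anywhere` is a
hypothetical helper, not a function from any Illumina or Bioconductor
package.

```python
# Hypothetical detection p-values: probe ID -> one p-value per array.
# All values are invented for illustration only.
detection_p = {
    "probe_A": [0.001, 0.20, 0.64],  # detected on the first array only
    "probe_B": [0.31, 0.47, 0.88],   # undetected on every array
    "probe_C": [0.00, 0.01, 0.03],   # detected on every array
}


def keep_if_detected_anywhere(pvals_by_probe, cutoff=0.05):
    """Drop only the probes that are undetected (p >= cutoff) on all arrays."""
    return [
        probe
        for probe, pvals in pvals_by_probe.items()
        if any(p < cutoff for p in pvals)
    ]


kept = keep_if_detected_anywhere(detection_p)
print(kept)
```

Under this rule a probe survives as soon as a single array calls it
detected, so only probe_B is removed in the toy data.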
Lana

-----Original Message-----
From: seandavi at gmail.com [mailto:seandavi at gmail.com] On Behalf Of Sean Davis
Sent: Wednesday, August 20, 2008 1:20 PM
To: Lana Schaffer
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] filtering Illumina data

On Wed, Aug 20, 2008 at 4:09 PM, Lana Schaffer <schaffer at scripps.edu>
wrote:
> Hi,
> I have filtered Illumina data from 46,633 probes to 6537 probes using 
> the Detection Pval.  I used a cutoff of .05 to call detection across 
> all the arrays.
> Can someone tell me if this is reasonable?
> What is a better way of filtering?

I would definitely not require ALL the arrays to pass your cutoff.
Requiring 10-20% of samples to be detected for a given probe is perhaps
more appropriate.  If you force all arrays to meet the detection cutoff,
you exclude potentially interesting probes that are "on" in one subset of
samples but "off" in another.  An alternative is to filter by variation
(the coefficient of variation, CV, for example).
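
A rough sketch of both ideas follows, in Python only to keep it
self-contained. The data are invented, the function names and the
`min_fraction` and `min_cv` thresholds are assumptions chosen for
illustration, and the CV is assumed to be computed on unlogged, positive
intensities.

```python
from statistics import mean, stdev

# Hypothetical data for five arrays; all numbers are invented.
detection_p = {
    "probe_A": [0.001, 0.30, 0.45, 0.60, 0.72],  # detected on 1 of 5 arrays
    "probe_B": [0.31, 0.47, 0.88, 0.52, 0.09],   # detected on 0 of 5 arrays
    "probe_C": [0.00, 0.01, 0.03, 0.02, 0.04],   # detected on 5 of 5 arrays
}
intensity = {
    "probe_A": [900.0, 110.0, 105.0, 98.0, 102.0],       # "on" in one sample
    "probe_B": [100.0, 102.0, 98.0, 101.0, 99.0],        # flat, near background
    "probe_C": [1500.0, 1520.0, 1480.0, 1510.0, 1490.0], # flat, high signal
}


def filter_by_detected_fraction(pvals_by_probe, cutoff=0.05, min_fraction=0.2):
    """Keep probes detected (p < cutoff) on at least min_fraction of arrays."""
    kept = []
    for probe, pvals in pvals_by_probe.items():
        fraction = sum(p < cutoff for p in pvals) / len(pvals)
        if fraction >= min_fraction:
            kept.append(probe)
    return kept


def filter_by_cv(intensity_by_probe, min_cv=0.1):
    """Keep probes whose coefficient of variation (sd / mean) is >= min_cv."""
    kept = []
    for probe, values in intensity_by_probe.items():
        cv = stdev(values) / mean(values)  # sample standard deviation over mean
        if cv >= min_cv:
            kept.append(probe)
    return kept


by_fraction = filter_by_detected_fraction(detection_p)
by_cv = filter_by_cv(intensity)
print(by_fraction)
print(by_cv)
```

In the toy data the fraction filter keeps probe_A and probe_C, while the
CV filter keeps only probe_A: a probe that is uniformly high (probe_C)
passes detection but shows little variation across samples.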

Sean


