[BioC] Filtering before differential analysis
drnevich at illinois.edu
Mon Jan 19 16:49:29 CET 2009
I'll answer both of your e-mails here and post the reply back to the
list, so the answers can become part of the archives.
At 07:34 PM 1/17/2009, Sally wrote:
>Do you mean I should not have a myfun script? Should I be giving
>weights to spots at all? I'm a bit confused as to what I should do.
That depends on what the flags are in your data files...
>Thanks for the reply. I used Imagene to scan my slides. Imagene is
>fairly primitive. It only has 3 flags: 0, 1, 2. 0 is a 'good'
>spot, 1 is a spot which was marked as no good by the user during
>gridding, and 2 is a bad spot. Are you saying I should not 'filter
>out' these spots (before computing DE genes) using the script
>myfun <- function(x) as.numeric ( x$Flag >0)?
>When you say Gordon says not to flag out [?filter out] spots (in my
>case with Imagene) using
>myfun <- function(x) as.numeric ( x$Flag >0).
>How should I re-write this script to include all spots?
You should filter out spots that the user marked as no good during
the gridding (dust spots, scratches, etc.), but not the spots the
program automatically marked as bad (usually not above background).
So if you want to give a weight of 1 to all the spots that have flags
not equal to 1, then your function is:
myfun <- function(x) as.numeric(x$Flag != 1)
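As a quick check of what that function does, here is a minimal sketch on a made-up data frame. The `spots` object and its flag values are invented for illustration and are not taken from your files. With limma, a function like this would normally be passed as the `wt.fun` argument to `read.maimages`, which I'm assuming is how you are reading your Imagene files.

```r
# Weight function: 0 for spots the user flagged during gridding (Flag == 1),
# 1 for everything else (Flag 0 = good, Flag 2 = flagged automatically).
myfun <- function(x) as.numeric(x$Flag != 1)

# Hypothetical toy data standing in for the Flag column of one Imagene file
spots <- data.frame(Flag = c(0, 1, 2, 0))

# One weight per spot, in the same order as the rows of the file
weights <- myfun(spots)
print(weights)
```

Only the second spot (Flag of 1) gets a weight of 0. The spot with a Flag of 2 keeps a weight of 1 and stays in the analysis.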
>----- Original Message ----- From: "Jenny Drnevich" <drnevich at illinois.edu>
>To: "Sally" <sagoldes at shaw.ca>; <bioconductor at stat.math.ethz.ch>
>Cc: "Sally" <sagoldes at shaw.ca>
>Sent: Friday, January 16, 2009 7:55 AM
>Subject: Re: [BioC] Filtering before differential analysis
>>Your script is transferring the flags to weights, and in your
>>script, only ESTs with a flag of 0 get a weight of 1, and all other
>>spots get a weight of 0, which means they are not used at all in
>>the analysis. So yes, you are in effect "filtering" out these
>>individual spots by setting the weights to 0, which is exactly what
>>Gordon said you should not do. I second this opinion for the
>>following reason: a "bad" spot is a spot where you had no
>>information whatsoever on what the expression level might have
>>been, so the number that you get (because you always get a number)
>>has no relationship at all to what the real value was and so you
>>should throw it out by giving it a weight of 0. However, if you
>>don't measure anything above background for a particular spot
>>(which GenePix will flag as -50), it's not a "bad" spot, because you
>>do have useful information that the expression level is below
>>detection, and the number that you get will be relatively valid
>>compared to other spots that had detectable expression. Would you
>>throw out values of 0 if you got them in any other scientific
>>measurement? Likely not, so why throw them out here?
>>At 11:30 AM 1/15/2009, Sally wrote:
>>>Is flagging the same as filtering? My limma script uses
>>>only those ESTs with a flag of 0 (which are good spots).
>>>myfun <- function(x) as.numeric ( x$Flag >0)
>>> Is this not the same as filtering? If I actually remove the
>>> absent spots from my Imagene files, then the files each have
>>> different lengths and the order of genes is not the same in each file.
>>Jenny Drnevich, Ph.D.
>>Functional Genomics Bioinformatics Specialist
>>W.M. Keck Center for Comparative and Functional Genomics
>>Roy J. Carver Biotechnology Center
>>University of Illinois, Urbana-Champaign
>>1201 W. Gregory Dr.
>>Urbana, IL 61801
>>e-mail: drnevich at illinois.edu
Jenny Drnevich, Ph.D.
Functional Genomics Bioinformatics Specialist
W.M. Keck Center for Comparative and Functional Genomics
Roy J. Carver Biotechnology Center
University of Illinois
1201 W. Gregory Dr.
Urbana, IL 61801
Email: drnevich at uiuc.edu