[BioC] Exclude probes that show sd above 0.1 between replicate values
J.delasHeras at ed.ac.uk
Thu Mar 1 15:19:32 CET 2007
Hi Jan,
not sure if I am understanding.
I am with you about the variability of replicate spots... as long as
they can be measured reliably, as you say. The question I guess is
where do you take these measurements: at the intensity or at the ratio
level? If you're looking at the variability based on ratios (M
values), replicate spots with no signal in one channel tend to have
wildly varying M values (all quite high, in absolute value). Wouldn't
a filtering based solely on variation at M value level discard those
spots? For these kinds of spots the M value is irrelevant (I mean, how
much is something divided by *almost* nothing?), we don't really have
a use for the actual number, except for the fact that it should be
large.
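
To make the point about M values concrete, here is a small Python sketch. The intensities are made up for illustration (they are not data from this thread), and the helper name m_values is hypothetical. It compares replicate spots with signal in both channels against replicate spots where the green channel is close to zero.

```python
import math
import statistics


def m_values(red, green):
    """Log2 ratios M = log2(R/G) for paired replicate intensities."""
    return [math.log2(r / g) for r, g in zip(red, green)]


# Made-up background-corrected intensities for four replicate spots.
# Gene A: clear signal in both channels, roughly 2-fold up.
gene_a_red = [2000.0, 2100.0, 1900.0, 2050.0]
gene_a_green = [1000.0, 1020.0, 980.0, 1010.0]

# Gene B: same red signal, but the green channel is *almost* nothing.
gene_b_red = [2000.0, 2100.0, 1900.0, 2050.0]
gene_b_green = [2.0, 15.0, 0.5, 6.0]

m_a = m_values(gene_a_red, gene_a_green)
m_b = m_values(gene_b_red, gene_b_green)

# Sample standard deviation of M across the replicates.
sd_a = statistics.stdev(m_a)
sd_b = statistics.stdev(m_b)

print("gene A: M =", [round(m, 2) for m in m_a], "sd =", round(sd_a, 3))
print("gene B: M =", [round(m, 2) for m in m_b], "sd =", round(sd_b, 3))
```

With these numbers gene A stays under an sd cutoff of 0.1, while gene B has uniformly large M values but a much larger sd, so a filter based only on the sd of M would discard it.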
As you say, I guess that any analysis depends on what you're after,
but most "general" approaches I see mentioned don't seem to account for
this particular case, where the signal is missing in only one channel. In
fact, some people just remove any spot unless its signal is detectable
in both channels, which for my purposes would be a disaster
[1]. I have my own approach to deal with this, and I am reasonably
happy, but I am very curious to see how other people approach this
issue.
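
As one possible illustration (an assumption for discussion, not the approach referred to above and not something limma or any other package does), a replicate filter could exempt clean on/off spots from the sd test. The function name keep_spot, the sd cutoff of 0.1 and the background threshold of 50 are all hypothetical choices.

```python
import math
import statistics


def keep_spot(red, green, sd_cutoff=0.1, background=50.0):
    """Decide whether a set of replicate spots survives filtering.

    red, green: background-corrected intensities, one value per replicate
    (at least two replicates). sd_cutoff and background are illustrative
    values, not recommendations.
    """
    red_on = all(r > background for r in red)
    green_on = all(g > background for g in green)
    red_off = all(r <= background for r in red)
    green_off = all(g <= background for g in green)

    # On/off case: one channel consistently detected, the other consistently
    # absent. The exact M value is unreliable here, so skip the sd test.
    if (red_on and green_off) or (green_on and red_off):
        return True

    # Neither a clean on/off pattern nor a consistent signal in both
    # channels: treat the spot as not reliably measured.
    if not (red_on and green_on):
        return False

    # Both channels detected: apply the sd filter on the log2 ratios.
    m = [math.log2(r / g) for r, g in zip(red, green)]
    return statistics.stdev(m) <= sd_cutoff


# Made-up examples, four replicates each.
consistent = keep_spot([2000, 2100, 1900, 2050], [1000, 1020, 980, 1010])
switched_on = keep_spot([2000, 2100, 1900, 2050], [2, 15, 0.5, 6])
noisy = keep_spot([2000, 900, 1500, 3000], [1000, 1000, 1000, 1000])
empty = keep_spot([10, 20, 5, 8], [3, 12, 7, 9])
```

The design choice here is to check the detection pattern before looking at M at all, so that spots with signal in only one channel are never judged by a ratio that has no meaningful value.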
[1] A while ago we had a demo of the Acuity software at our centre.
The guy contacted us beforehand, asking if we had some real data we'd
like to use in the demo. He chose some of my data, which I thought was
great, as I had already analysed it using my usual tools. His demo
picked up genes I knew to be upregulated... but my "top genes", which
I've continued to use in my experiments, were all missing, as they had
been left behind in one of the filtering steps: either the low
intensity filter (applied on *either* channel) or the standard
deviation filter on log2 ratios (not sure which one, probably
both)... it took me a while to convince him that I really, really
didn't want those spots removed, which surprised me. Are most people
really throwing away these kinds of spots?
Jose
Quoting J.Oosting at lumc.nl:
> Hi Jose,
>
> IMHO you should use the variability of replicate spots whenever
> possible. Limma can handle this nicely and for the analysis of
> differential expression I always leave in the replicate spots, and I let
> limma handle them.
>
> For presentation purposes (i.e. heatmaps) it is usually handy to have
> averaged values per gene, and I think that removing genes that cannot be
> measured reliably is a way of improving the visualizations.
>
> Any data-manipulation is context dependent, and especially the effects
> of removing data points should be considered case by case. If you're
> interested in on/off phenomena you should not remove 'empty' spots.
>
> Regards,
>
> Jan
>
>>
>> If you look at the variation on M values alone (it's a
>> MAList), and throw away those with high variation... that
>> sounds like a reasonable thing to do, except that when you
>> have spots with no signal in only one of the channels, the
>> variation is probably quite high too, and you'd remove them.
>> However, they are probably quite an interesting class of
>> spots to keep (genes that become silenced, or activated,
>> after treatment, not merely down/upregulated).
>>
>> I'm mostly studying experiments where I am interested
>> in these cases of activation/silencing, and not so much in
>> up/downregulation alone. I wonder how people account for
>> these situations...
>>
>> Jose
>>
>
>
--
Dr. Jose I. de las Heras Email: J.delasHeras at ed.ac.uk
The Wellcome Trust Centre for Cell Biology Phone: +44 (0)131 6513374
Institute for Cell & Molecular Biology Fax: +44 (0)131 6507360
Swann Building, Mayfield Road
University of Edinburgh
Edinburgh EH9 3JR
UK