[BioC] Harsh results using limma!

Fri Aug 13 15:38:54 CEST 2004

The OTHER explanation could be technician error or inadvertent
cross-hybridization during processing; the catch is that you also
seem to have a batch effect.

The more critical issue is that you've got to use statistics that
describe what you want.  Obviously, those that standardize location
(mean/median) by some measure of variability (std error/interquartile
range) are going to give you p-values (here, non-significant) which
reflect that standardization: one discordant replicate inflates the
denominator and shrinks the statistic.
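
To make the standardization point concrete, here is a rough sketch in
Python (standard library only) using the Gene1 log ratios quoted below.
It is an ordinary one-sample t statistic, NOT limma's moderated
statistic, and it treats all arrays as independent, which ignores the
pairing of dye-swaps within an experiment; both are simplifying
assumptions made only for illustration.  The names `oriented` and
`one_sample_t` are mine.

```python
import math
import statistics

# Gene1 log ratios from the quoted message; dye-swap arrays have their
# sign flipped so that all six values point in the same direction.
oriented = [-5.27, -6.29, -4.61, -5.54, -0.2, -0.2]


def one_sample_t(values):
    """Ordinary one-sample t: the mean divided by its standard error."""
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / math.sqrt(len(values))
    return mean / std_err


t_four = one_sample_t(oriented[:4])  # experiments 1 and 2 only
t_six = one_sample_t(oriented)       # all three experiments

print(f"t (4 arrays) = {t_four:.2f}")
print(f"t (6 arrays) = {t_six:.2f}")
```

The mean barely moves from four arrays to six compared with how much
the standard deviation grows, so the statistic shrinks sharply in
magnitude.  That is the standardization at work.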

If you want a "consistency" / magnitude statistic (say, a sign test
augmented in some manner with the magnitude), at this point you'd have
to be creative.  But having been creative, you still could get a
distribution via simulation or resampling to work from to obtain
p-values.
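
As a minimal sketch of the resampling idea, assuming that the unit of
replication is the experiment (so each dye-swap pair is first averaged)
and that under the null the sign of each experiment's log ratio is
arbitrary, an exact sign-flip test looks like this.  The statistic
here, the absolute mean, is only a placeholder; any "consistency" /
magnitude statistic you invent could be passed in instead.  The names
`sign_flip_p_value` and `abs_mean` are mine.

```python
from itertools import product
from statistics import mean

# One value per experiment: the average of the two oriented dye-swap
# log ratios for Gene1 from the quoted message.
replicate_means = [
    mean([-5.27, -6.29]),
    mean([-4.61, -5.54]),
    mean([-0.2, -0.2]),
]


def abs_mean(values):
    """Placeholder statistic: magnitude of the average log ratio."""
    return abs(mean(values))


def sign_flip_p_value(values, statistic):
    """Exact two-sided p-value over all 2**n sign assignments."""
    observed = statistic(values)
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(values)):
        flipped = [s * v for s, v in zip(signs, values)]
        total += 1
        # Small tolerance guards against floating-point ties.
        if statistic(flipped) >= observed - 1e-12:
            hits += 1
    return hits / total


p_value = sign_flip_p_value(replicate_means, abs_mean)
print(p_value)
```

Note that with only three experiments there are just 2**3 = 8 sign
assignments, so the smallest two-sided p-value this scheme can ever
return is 2/8 = 0.25, no matter how clever the statistic is.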

The only problem will be trying to convince reviewers (or folks
playing devil's advocate) that your "statistic" is reasonable for
differential expression.

best,
-tony

"michael watson (IAH-C)" <michael.watson at bbsrc.ac.uk> writes:

> Hi Gordon
>
> Yes you're right.  I didn't really mean to compare limma to a t-test.
> It's just that the results are very consistent within technical
> replicates (the dye-swaps), just not consistent between biological
> replicates.  But this is the situation we expect - technical replicates
> highly correlated and biological replicates much less so.  Clearly
> differences of 0.2 could be noise, but my dye-swaps BOTH came up with
> 0.2.  If I had ten replicate dye-swaps, all with 0.2 as the log(ratio)
> would we still call this noise?   Given that the other replicate
> experiments were also highly reproducible, I can't help but think this
> gene is differentially expressed.
>
> I know why limma and t-test disregard this gene, I just still think it
> is a little harsh and that I am "throwing the baby out with the
> bathwater", as it were.  
>
> Mick
>
> -----Original Message-----
> From: Gordon Smyth [mailto:smyth at wehi.edu.au] 
> Sent: 13 August 2004 12:56
> To: michael watson (IAH-C)
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Harsh results using limma!
>
>
> At 09:14 PM 13/08/2004, michael watson (IAH-C) wrote:
>>Hi
>>
>>Firstly, I think limma is excellent and use it a lot, but some recent 
>>results are a bit, erm, disappointing and I wondered if someone could 
>>explain them.
>>
>>Basic set up was a double dye-swap experiment (4 arrays) involving 
>>different animals, one infected with one type of bacterium and the 
>>other a different bacterium, compared to one another directly.  I used 
>>limma to analyse this and got a list of genes differentially regulated 
>>- great!
>>
>>THEN another replicate experiment was performed (so now I have 6 
>>arrays, 3 dye-swaps), and I re-did the analysis and my set of genes was
>>completely different - but that's fine, we can put that down to 
>>biological variation.  We know limma likes genes which show consistent 
>>results across arrays, and when I looked at my data, I found that the 
>>genes in my original list were not consistent across all six arrays.  
>>So I am reasonably happy about this.
>>
>>My question comes from looking at the top gene from my old list in the 
>>context of all six arrays.  Here are the normalised log ratios across 
>>all six arrays (ds indicates the dye-swap):
>>
>>Gene1
>>Exp1     -5.27
>>Exp1ds    6.29
>>Exp2     -4.61
>>Exp2ds    5.54
>>Exp3     -0.2
>>Exp3ds    0.2
>
> Changes of +-0.2 are tiny and look like pure noise. So, you can have a gene
> for which only 2/3 of your mice show a difference. Statistical methods
> based on means and standard deviations will always judge this situation
> harshly. If you try an ordinary t-test rather than the limma method, you'll
> find that this gene would be judged much more harshly again.
>
> Gordon
>
>>Not surprisingly, limma put this as the top gene when looking at the 
>>first four arrays.  However, when looking across all six arrays, limma 
>>places it at 230 in the list with a p-value of 0.11 (previously the 
>>p-value was 0.0004).
>>
>>So finally we get to my point/question - does this gene really 
>>"deserve" a p-value of 0.11 (ie not significant)?  In every case the 
>>dye-flips are the correct way round, it is only the magnitude of the 
>>log(ratio) which differs - and as we are talking about BIOLOGICAL 
>>variation here, don't we expect the magnitude to change?  If we are 
>>taking into account biological variation, surely we can't realistically
>>expect consistent ratios across all replicate experiments??  Isn't limma
>>being a little harsh here?  After all the average log ratio is -3.7
>>(taking into 
>>account the dye-flips) - and to me, experiment 3's results still 
>>support the idea of the gene being differentially expressed, and are 
>>even consistent within that biological replicate.
>>
>>Clearly I am looking at this data from a biologist's point of view and 
>>not a statistician's.  But we are studying biology, not statistics, and 
>>I can't help feel I am missing out on something important here if I 
>>disregard this gene as not significantly differentially expressed (NB 
>>this is just the first example, there are many others).
>>
>>I should also add that there appears nothing strange about the arrays 
>>for Experiment 3 - the distribution of log(ratio) for those arrays is 
>>pretty much the same as the other four, so this is not an array-effect,
>>it is an effect due to natural biological variation.
>>
>>Comments, questions, criticisms all welcome :-)
>>
>>Mick
>

-- 
Anthony Rossini			    Research Associate Professor
rossini at u.washington.edu            http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN          Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email
