[Bioc-sig-seq] RNASeq, differential expression between group, and large variance within groups

Gordon K Smyth smyth at wehi.EDU.AU
Wed Mar 2 07:15:53 CET 2011


Hi Laurent,

There's a typo in my email.  I meant to write:

1. edgeR will stop reporting tags with extreme variances as 
*differentially expressed* if the user reduces the prior weight, prior.n 
...
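
To illustrate why the prior weight matters, here is a minimal Python
sketch of the general idea. It is not edgeR's algorithm: it assumes a
simplified rule in which the moderated dispersion is a weighted average
of the tag's own estimate and the common dispersion, with prior_n as the
weight on the common value. The function name, the averaging rule, and
all the numbers are hypothetical.

```python
def moderated_dispersion(tag_disp, common_disp, n_eff, prior_n):
    """Weighted average of a tag's own dispersion and the common one.

    Illustrative assumption only: n_eff is the weight given to the tag's
    own estimate, prior_n the weight given to the common dispersion.
    """
    return (n_eff * tag_disp + prior_n * common_disp) / (n_eff + prior_n)


common = 0.1       # made-up common dispersion
extreme_tag = 4.0  # made-up tagwise estimate for a very variable tag
n_eff = 1.0        # made-up weight on the tag's own data

# Smaller prior_n pulls the result less strongly towards the common value.
for prior_n in (10, 1, 0.1):
    value = moderated_dispersion(extreme_tag, common, n_eff, prior_n)
    print(prior_n, round(value, 3))
```

Under this simplified rule, reducing prior_n leaves the moderated
dispersion of a highly variable tag closer to its own large estimate, so
the tag is less likely to be called differentially expressed.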

Gordon

On Wed, 2 Mar 2011, Gordon K Smyth wrote:

> Hi Laurent,
>
> Thanks for the nice summary.  Two more points:
>
> 1. edgeR will stop reporting tags with extreme variances as outliers if the 
> user reduces the prior weight, prior.n, given to the common dispersion 
> (expressed in terms of the number of notional prior tags).  Seeing such tags 
> in the topTags table may prompt the user to do this.
>
> 2. It would be very helpful to know whether these high variance tags arise 
> from (i) technical errors specific to one count, (ii) technical issues 
> affecting a tag or (iii) genuine biological variation.  If (i), then we could 
> design software to detect outlier counts.  If (ii), we could design software 
> to detect outlier tags.  If (iii), then an empirical Bayes approach to 
> moderating the dispersions, such as is done by edgeR, may be the best that 
> can be done.
>
> I don't know for sure how to distinguish these causes, but here are some 
> thoughts.  In your original post, you showed a tag with a large count for 
> library A3 but zeros for all other libraries.  Is library A3 systematically 
> different from libraries A1 and A2 for other tags as well as this one?  If 
> this tag is part of co-regulated pathways that are highly expressed in A3 
> relative to the others, then likely it is real biological variation.  If A3 
> differs from A1 and A2 only in a handful of tags with no biological 
> connection, then perhaps it is a technical issue.
>
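
One rough way to look at whether a library such as A3 stands apart is to
list, for each library, the tags that only that library expresses. The
Python sketch below is one possible check, not an established method;
the function name, the min_count threshold, and the counts are all made
up for illustration.

```python
def lone_library_tags(counts, libraries, min_count=10):
    """Map each library to the tags where it alone has a non-zero count.

    A tag is listed for a library when that library's count is at least
    min_count and every other library's count is zero.
    """
    hits = {lib: [] for lib in libraries}
    for tag, row in counts.items():
        for i, lib in enumerate(libraries):
            others = row[:i] + row[i + 1:]
            if row[i] >= min_count and all(c == 0 for c in others):
                hits[lib].append(tag)
    return hits


libraries = ["A1", "A2", "A3"]
counts = {  # hypothetical counts, one column per library
    "tag1": [0, 0, 250],
    "tag2": [12, 15, 11],
    "tag3": [0, 0, 40],
    "tag4": [30, 0, 0],
}

hits = lone_library_tags(counts, libraries)
for lib in libraries:
    print(lib, hits[lib])
```

The tags listed for a library could then be checked for a shared
biological connection: many related tags would point towards real
variation, a handful of unrelated ones towards a technical issue.
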
> Regards
> Gordon
>
>> Date: Tue, 01 Mar 2011 10:25:31 +0100
>> From: Laurent Gautier <lgautier at gmail.com>
>> To: bioc-sig-sequencing at r-project.org
>> Cc: anders at embl.de
>> Subject: Re: [Bioc-sig-seq] RNASeq, differential expression between
>> 	group, and large variance within groups
>> 
>> Thanks to Mads, Simon, and Steve.
>> 
>> In summary:
>> 
>> - extreme variance within group (zero or large value) is not a good
>> sign, and experimental issues are to be suspected
>> - pooling (summing) tags over reference transcripts can rescue some of
>> the signal
>> - DESeq, and to some extent edgeR, will report genes/tags with such
>> pathological counts as differentially expressed when they should not
>> be. The issue is acknowledged and care should be taken (here we use
>> various visualizations to complement the p-values).
>> 
>> Laurent
>



