[Bioc-sig-seq] RNASeq, differential expression between group, and large variance within groups
Gordon K Smyth
smyth at wehi.EDU.AU
Wed Mar 2 05:18:10 CET 2011
Hi Laurent,
Thanks for the nice summary. Two more points:
1. edgeR will stop reporting tags with extreme variances as outliers if
the user reduces the prior weight, prior.n, given to the common dispersion
(expressed in terms of the number of notional prior tags). Seeing such
tags in the topTags table may prompt the user to do this.
2. It would be very helpful to know whether these high variance tags arise
from (i) technical errors specific to one count, (ii) technical issues
affecting a tag or (iii) genuine biological variation. If (i), then we
could design software to detect outlier counts. If (ii), we could design
software to detect outlier tags. If (iii), then an empirical Bayes
approach to moderating the dispersions, such as is done by edgeR, may be
the best that can be done.
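
To illustrate point 1, here is a toy sketch in Python of why the prior weight matters. It is not edgeR's computation (edgeR moderates dispersions through weighted likelihoods); it only assumes, for illustration, that the moderated dispersion behaves like a weighted average of the tag's own estimate and the common dispersion. The function name, the argument names and the numbers are all hypothetical.

```python
def moderate_dispersion(tagwise, common, prior_weight):
    """Toy shrinkage: weighted average of a tag's own dispersion estimate
    and the common dispersion.

    prior_weight stands in for the idea behind prior.n: the weight given to
    the common dispersion, expressed as a number of notional prior tags.
    This is an illustrative assumption, not edgeR's actual formula.
    """
    if prior_weight < 0:
        raise ValueError("prior_weight must be non-negative")
    return (tagwise + prior_weight * common) / (1.0 + prior_weight)


# Hypothetical values: one very variable tag against a small common dispersion.
common_dispersion = 0.1
noisy_tag_dispersion = 2.0

# Large prior weight: the tag is squeezed almost all the way to the common value.
heavy = moderate_dispersion(noisy_tag_dispersion, common_dispersion, 10)

# Small prior weight: the tag keeps most of its own, large dispersion.
light = moderate_dispersion(noisy_tag_dispersion, common_dispersion, 1)

# No prior weight: the tag's own estimate is used unchanged.
none = moderate_dispersion(noisy_tag_dispersion, common_dispersion, 0)

print(heavy, light, none)
```

With a heavy prior weight the noisy tag is treated almost as if it had the common dispersion, so its extreme counts can look significant. With a lighter weight it keeps a large dispersion and is less likely to appear near the top of the table.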
I don't know for sure how to distinguish these causes, but here are some
thoughts. In your original post, you showed a tag with a large count for
library A3 but zeros for all other libraries. Is library A3
systematically different from libraries A1 and A2 for other tags as well
as this one? If this tag is part of co-regulated pathways that are highly
expressed in A3 relative to the others, then it is likely real biological
variation. If A3 differs from A1 and A2 only in a handful of tags with no
biological connection, then it is more likely a technical issue.
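
As a rough first pass at the question above, one could list the tags for which A3 stands well above A1 and A2 after scaling by library size, and then check by hand whether those tags share any biology. The Python sketch below is only one way to do this. The counts are invented, and the function name, the pseudo-count and the threshold are arbitrary assumptions rather than anything taken from edgeR or DESeq.

```python
import math


def a3_specific_tags(counts, min_log2_ratio=3.0, pseudo=0.5):
    """Return (tag, log2 ratio) pairs where A3 exceeds the mean of A1 and A2.

    counts maps a tag name to raw counts in the order (A1, A2, A3).
    Counts are converted to counts per million using the library totals;
    a small pseudo-count avoids division by zero for all-zero libraries.
    """
    lib_sizes = [sum(c[i] for c in counts.values()) for i in range(3)]
    flagged = []
    for tag, c in counts.items():
        cpm = [(c[i] + pseudo) / lib_sizes[i] * 1e6 for i in range(3)]
        ratio = math.log2(cpm[2] / ((cpm[0] + cpm[1]) / 2.0))
        if ratio >= min_log2_ratio:
            flagged.append((tag, ratio))
    # Most A3-specific tags first.
    return sorted(flagged, key=lambda pair: -pair[1])


# Invented counts for five hypothetical tags in libraries A1, A2, A3.
toy_counts = {
    "tagA": (0, 0, 250),
    "tagB": (100, 120, 110),
    "tagC": (50, 45, 55),
    "tagD": (0, 1, 180),
    "tagE": (300, 280, 310),
}

flagged = a3_specific_tags(toy_counts)
for tag, ratio in flagged:
    print(tag, round(ratio, 2))
```

The list itself does not settle the question. If the flagged tags fall in related pathways, that points towards biological variation; if they are few and unrelated, a technical problem with A3 becomes more plausible.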
Regards
Gordon
> Date: Tue, 01 Mar 2011 10:25:31 +0100
> From: Laurent Gautier <lgautier at gmail.com>
> To: bioc-sig-sequencing at r-project.org
> Cc: anders at embl.de
> Subject: Re: [Bioc-sig-seq] RNASeq, differential expression between
> group, and large variance within groups
>
> Thanks to Mads, Simon, and Steve.
>
> In summary:
>
> - extreme variance within group (zero or large value) is not a good
> sign, and experimental issues are to be suspected
> - pooling (summing) tags over reference transcripts can rescue some of
> the signal
> - DESeq, and to some extent edgeR, will report genes/tags with such
> pathological counts as differentially expressed when they should not
> be. The issue is acknowledged and care should be taken (here we use
> various visualizations to complement the p-values).
>
> Laurent