[BioC] [Bioc] RNAseq less sensitive than microarrays? Is it a statistical issue?
Simon Anders
anders at embl.de
Thu May 23 09:14:11 CEST 2013
Hi Lucia
On 21/05/13 23:19, Lucia Peixoto wrote:
> In terms of my count table, the counts are generated by RUM as described
> in the paper, I just parse them out into a table format using custom
> scripts. I do not have much experience with this, but in principle I do
> not see anything wrong with the way the counts are generated, and neither do
> the developers, who are down the hall. It produces very high
> correlation to Ct values by qPCR after logRPKM transformation based on
> the data I have seen from them.
I did not know that RUM produces a count table, but if this works well,
it sounds like a useful feature. However, as I pointed out in my reply
to Thomas, counting for expression strength estimation and counting for
differential testing are slightly different tasks, and it would be
important to know what the RUM developers aimed for when they decided on
a strategy to deal with ambiguously mapped reads. If you ask them, let
us know what they say, please.
To get back to the original issue: The sensitivity of RNA-Seq obviously
depends on read depth. If you have few reads, Poisson noise becomes
strong, and such an RNA-Seq experiment will be less precise than a
microarray assay, especially if overall expression strength of genes is
all you care about.
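
To put rough numbers on the Poisson argument, here is a minimal sketch in Python. It assumes the read count for a gene is purely Poisson distributed, so the relative noise (standard deviation divided by mean) is 1/sqrt(expected count). It ignores biological variability and any other noise source. The function names and the example gene fraction of 1e-5 are made up for illustration.

```python
import math


def expected_count(fraction_of_reads, library_size):
    """Expected number of reads for a gene that attracts a given
    fraction of all reads in a library of the given size."""
    return fraction_of_reads * library_size


def poisson_cv(mean_count):
    """Relative Poisson noise: sd / mean = sqrt(mean) / mean = 1 / sqrt(mean)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


if __name__ == "__main__":
    # Hypothetical gene that receives 1 in 100,000 reads.
    gene_fraction = 1e-5
    for library_size in (1e6, 1e7, 1e8):
        mu = expected_count(gene_fraction, library_size)
        print(f"{library_size:.0e} reads: expected count {mu:.0f}, "
              f"relative noise {poisson_cv(mu):.3f}")
```

Under these assumptions, a gene with 10 expected reads has about 32% relative counting noise, and it takes 100 expected reads (ten times the depth) to bring that down to 10%.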
The break-even point where RNA-Seq becomes better than microarrays is
probably somewhere between 1 and 10 million reads. (This is a rough guess;
the papers mentioned in the thread may contain proper figures.)
You seem to have more than that. But you also mentioned that you have an
unusually high number of PCR duplicates, which means that the
information content of your reads is reduced. Also, check the column
sums of your count table: What fraction of your reads actually gets
assigned to a gene? It's rather common for things to go wrong here.
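
The column-sum check could be sketched as follows, in plain Python. This is only an illustration: it assumes the count table is available as a mapping from sample name to a list of per-gene counts, and that the total number of sequenced reads per sample is known separately. Both data structures and the function name are hypothetical and have nothing to do with RUM's actual output format.

```python
def assigned_fraction(count_table, total_reads):
    """For each sample, return column sum / total reads, i.e. the
    fraction of sequenced reads that ended up assigned to some gene.

    count_table: dict mapping sample name -> list of per-gene counts
    total_reads: dict mapping sample name -> number of reads sequenced
    """
    fractions = {}
    for sample, counts in count_table.items():
        total = total_reads[sample]
        if total <= 0:
            raise ValueError(f"total read count for {sample!r} must be positive")
        fractions[sample] = sum(counts) / total
    return fractions


if __name__ == "__main__":
    # Toy numbers, invented for illustration only.
    counts = {"sampleA": [10, 20, 30], "sampleB": [5, 0, 15]}
    totals = {"sampleA": 120, "sampleB": 200}
    for sample, frac in assigned_fraction(counts, totals).items():
        print(f"{sample}: {frac:.1%} of reads assigned to a gene")
```

A sample whose fraction is much lower than the others, or low across the board, would be worth a closer look before blaming the statistics.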
Simon