[BioC] RNAseq less sensitive than microarrays? Is it a statistical issue?

Thu May 23 09:14:11 CEST 2013

Hi Lucia

On 21/05/13 23:19, Lucia Peixoto wrote:
> In terms of my count table, the counts are generated by RUM as described
> in the paper, I just parse them out into a table format using custom
> scripts. I do not have much experience with this, but in principle I do
> not see anything wrong with the way the counts are generated, neither do
> the developers which are down the hall. It produces very high
> correlation to Ct values by qPCR after logRPKM transformation based on
> the data I have seen from them.

I did not know that RUM produces a count table, but if this works well, 
it sounds like a useful feature. However, as I pointed out in my reply 
to Thomas, counting for expression strength estimation and counting for 
differential testing are slightly different tasks, and it would be 
important to know what the RUM developers aimed for when they decided on 
a strategy to deal with ambiguously mapped reads. If you ask them, let 
us know what they say, please.

To get back to the original issue: The sensitivity of RNA-Seq obviously 
depends on read depth. If you have few reads, Poisson noise becomes 
strong, and such an RNA-Seq experiment will be less precise than a 
microarray assay, especially if overall expression strength of genes is 
all you care about.
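
To make the read-depth argument concrete, here is a minimal sketch of how pure Poisson noise shrinks with depth. It assumes counts are exactly Poisson (no biological variability), and the gene's share of the library (1 in 100,000 reads) is a made-up illustration value, not a figure from this thread. The function names are hypothetical.

```python
import math


def expected_count(fraction_of_library, total_reads):
    """Expected number of reads for a gene that attracts the given share of the library."""
    return fraction_of_library * total_reads


def poisson_cv(mean_count):
    """Relative Poisson noise (standard deviation / mean), which equals 1/sqrt(mean)."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    return 1.0 / math.sqrt(mean_count)


# Hypothetical gene share; chosen only for illustration.
gene_fraction = 1e-5

for depth in (1_000_000, 10_000_000, 100_000_000):
    mu = expected_count(gene_fraction, depth)
    print(f"{depth:>11,d} reads: expected count {mu:8.1f}, "
          f"relative noise {poisson_cv(mu):.3f}")
```

Under these assumptions, a tenfold increase in depth reduces the relative counting noise by a factor of about 3.2 (the square root of 10). Real data carry additional biological dispersion on top of this.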

The break-even point where RNA-Seq becomes better than microarrays is 
probably somewhere at 1 to 10 million reads. (This is a rough guess; the 
papers mentioned in the thread may contain proper figures.)

You seem to have more than that. But you also mentioned that you have an 
unusually high number of PCR duplicates, which means that the 
information content of your reads is reduced. Also, check the column 
sums of your count table: Which fraction of your reads actually gets 
assigned to a gene? It's rather common for things to go wrong here.
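
The column-sum check can be sketched as follows. This is only an illustration: the table layout (gene -> sample -> count), the sample names, and all numbers are invented, and the total number of sequenced reads per sample is assumed to be known from elsewhere (for example from the aligner's summary).

```python
def assigned_fraction(count_table, total_reads):
    """Return, per sample, the fraction of sequenced reads assigned to any gene.

    count_table: dict mapping gene -> dict mapping sample -> read count
    total_reads: dict mapping sample -> total number of sequenced reads
    """
    # Column sums: add up the counts of all genes for each sample.
    column_sums = {sample: 0 for sample in total_reads}
    for row in count_table.values():
        for sample, count in row.items():
            column_sums[sample] += count
    return {sample: column_sums[sample] / total_reads[sample]
            for sample in total_reads}


# Toy data, made up for illustration only.
toy_counts = {
    "geneA": {"ctrl": 300, "treated": 50},
    "geneB": {"ctrl": 200, "treated": 30},
    "geneC": {"ctrl": 100, "treated": 20},
}
toy_totals = {"ctrl": 1000, "treated": 1000}

fractions = assigned_fraction(toy_counts, toy_totals)
for sample, frac in fractions.items():
    print(f"{sample}: {frac:.0%} of reads assigned to a gene")
```

A sample whose assigned fraction is far below that of the others (like "treated" in the toy data) would be a hint that something went wrong in the counting step.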

   Simon