[BioC] edgeR on differentially expressed genes with low read counts

Simon Anders anders at embl.de
Tue Nov 29 18:51:45 CET 2011


Hi Naomi

On 11/29/2011 06:33 PM, Naomi Altman wrote:
> You should filter out the genes with extremely low total counts as you
> will not have enough power to achieve significance.

I think you were too quick here to recommend filtering. Filtering should 
help to increase power, but it should not be necessary to get type-I 
error control.

Lucia's problem is not that she has a lack of power, rather the 
opposite. She is worried that the reported p values are too optimistic, 
and I would agree that the example she quotes looks rather suspicious:

>> [...] When analyzing
>> some border line differentially expressed genes (FDR ~ 0.02) I found
>> that in some cases, the read counts was really low, e.g. only one
>> sample with 2 reads, and the others (7) with 0 counts.
>>
>> > countsTable[rownames(countsTable)=="ENSG00000207696",grep("CT",colnames(countsTable))]
>>                 61_CT.poli 61_CT.tot 67_CT.poli 67_CT.tot 70_CT.poli 70_CT.tot 61m2_CT.tot 67m2_CT.tot
>> ENSG00000207696          0         0          0         0          2         0           0           0

Something seems to have gone wrong here.
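
A rough sanity check of why this looks suspicious, as a sketch only and not 
what edgeR computes internally: assume, purely for illustration, that the 
eight samples split 4 vs 4 into two groups and that all libraries have the 
same size. The grouping and the names `counts`, `group` and `p_min` are 
hypothetical; the actual design is not given in the quoted message. 
Conditional on the 2 reads observed in total, the count falling into one 
group is then binomial with probability 0.5 under the null, so even the 
most extreme outcome (both reads in one group) cannot yield a small p value.

```r
# Counts for ENSG00000207696 as quoted above
counts <- c(0, 0, 0, 0, 2, 0, 0, 0)

# ASSUMPTION: a hypothetical 4-vs-4 design with equal library sizes;
# the real grouping is not stated in the quoted message.
group <- factor(rep(c("A", "B"), each = 4))

total <- sum(counts)                 # 2 reads overall
in_B  <- sum(counts[group == "B"])   # both reads fall into group B

# Exact two-sided binomial test, conditional on the total count.
# Under the null each read lands in group B with probability 0.5.
p_min <- binom.test(in_B, total, p = 0.5)$p.value
print(p_min)
```

Under these assumptions the p value is 0.5, and allowing for overdispersion 
would only weaken the evidence further, so an FDR near 0.02 for this gene 
is hard to explain by the counts shown.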

   Simon


