[BioC] edgeR on differentially expressed genes with low read counts
Simon Anders
anders at embl.de
Tue Nov 29 18:51:45 CET 2011
Hi Naomi
On 11/29/2011 06:33 PM, Naomi Altman wrote:
> You should filter out the genes with extremely low total counts as you
> will not have enough power to achieve significance.
I think you were too quick here to recommend filtering. Filtering should
help to increase power, but it should not be necessary to get type-I
error control.
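
To illustrate the point about type-I error control, here is a minimal
sketch in Python. It is not what edgeR computes. It assumes Poisson
counts, two groups with equal library sizes, and a conditional exact
binomial test; the function names and the expected totals are made up
for the illustration. Under these assumptions the false positive rate of
a valid test stays at or below the nominal level even when counts are
tiny, so filtering is not needed for error control.

```python
from math import comb, exp, factorial

def binom_pmf(k, n, share=0.5):
    """P(k of n reads fall in group A) when group A holds `share` of the depth."""
    return comb(n, k) * share**k * (1.0 - share)**(n - k)

def exact_p(k_obs, n, share=0.5):
    """Two-sided exact p value: total probability of all outcomes that are
    no more probable than the observed one (small tolerance for ties)."""
    p_obs = binom_pmf(k_obs, n, share)
    return sum(p for p in (binom_pmf(k, n, share) for k in range(n + 1))
               if p <= p_obs * (1 + 1e-9))

def type1_rate(expected_total, alpha=0.05, n_max=80):
    """Probability of rejecting a true null, averaged over the Poisson
    distribution of the total count (sum truncated at n_max)."""
    rate = 0.0
    for n in range(n_max + 1):
        weight = exp(-expected_total) * expected_total**n / factorial(n)
        reject = sum(binom_pmf(k, n) for k in range(n + 1)
                     if exact_p(k, n) <= alpha)
        rate += weight * reject
    return rate

# Hypothetical expected total counts per gene, from very low to moderate.
for lam in (1, 2, 5, 20):
    print(lam, type1_rate(lam))
```

The lower the expected count, the more conservative the test becomes,
which costs power but does not produce false positives.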
Lucia's problem is not that she has a lack of power, rather the
opposite. She is worried that the reported p values are too optimistic,
and I would agree that the example she quotes looks rather suspicious:
>> [...] When analyzing
>> some border line differentially expressed genes (FDR ~ 0.02) I found
>> that in some cases, the read counts was really low, e.g. only one
>> sample with 2 reads, and the others (7) with 0 counts.
>>
>> > countsTable[rownames(countsTable)=="ENSG00000207696",grep("CT",colnames(countsTable))]
>>
>>                 61_CT.poli 61_CT.tot 67_CT.poli 67_CT.tot 70_CT.poli 70_CT.tot 61m2_CT.tot 67m2_CT.tot
>> ENSG00000207696          0         0          0         0          2         0           0           0
Something seems to have gone wrong here.
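
To see why an FDR of about 0.02 looks implausible with two reads in
total, here is a rough Python sketch. It is not edgeR's calculation and
the experimental design is not stated in this thread: the sketch assumes
Poisson counts, a conditional exact binomial test, and a hypothetical
`share` parameter for the fraction of sequencing depth in one group.

```python
from math import comb

def binom_pmf(k, n, share):
    """P(k of n reads fall in group A) when group A holds `share` of the depth."""
    return comb(n, k) * share**k * (1.0 - share)**(n - k)

def exact_two_sided_p(k_obs, n, share=0.5):
    """Sum the probabilities of all outcomes that are no more probable
    than the observed one (small tolerance for floating-point ties)."""
    p_obs = binom_pmf(k_obs, n, share)
    return sum(p for p in (binom_pmf(k, n, share) for k in range(n + 1))
               if p <= p_obs * (1 + 1e-9))

def smallest_attainable_p(n, share=0.5):
    """Best p value any split of n reads between the two groups can give."""
    return min(exact_two_sided_p(k, n, share) for k in range(n + 1))

# Two reads in total, equal depth in both groups (assumed 4 vs 4 design).
print(smallest_attainable_p(2))  # 0.5
```

Under these assumptions, even the most extreme split of two reads gives
an unadjusted p value of 0.5, and multiple-testing adjustment can only
increase it.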
Simon