[BioC] re incomplete analysis in Deseq

Wolfgang Huber whuber at embl.de
Wed Mar 7 15:22:28 CET 2012


Simon Anders scripsit 03/06/2012 11:38 PM:
> Hi
>
> On 2012-03-06 22:21, Steve Lianoglou wrote:
>> Couldn't we just as well adapt the estimateSizeFactorsForMatrix
>> function to step over the (row,col) bins that have 0 counts
>> instead of skipping over rows that have only a single 0 element?
>
> This might cause a bit of a bias, as we would censor data that might pull
> down the estimate. As a compromise, I am tempted to suggest adding a
> pseudocount, e.g., something like 0.1 added to all values, to avoid the
> zeroes. This also causes some bias, but a less severe one. Of course, I
> have just pulled the value 0.1 out of thin air, and cannot quite come up
> with a good justification for it.
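
For readers following the thread, the three options discussed above (dropping every row that contains a zero, stepping over only the zero cells, and adding a pseudocount) can be compared in a rough sketch. This is not the DESeq implementation: it is a plain-Python illustration of a median-of-ratios size factor estimate, and the function name `size_factors`, its `mode` argument, and the toy counts are all hypothetical. The 0.1 default is only the value floated in the quoted message.

```python
import math
from statistics import median


def size_factors(counts, mode="skip_rows", pseudocount=0.1):
    """Median-of-ratios size factors for a genes x samples count table.

    counts: list of rows (one per gene), each a list of per-sample counts.
    mode (hypothetical names, for illustration only):
      "skip_rows"   - ignore every row that contains at least one zero
      "skip_cells"  - ignore only the individual zero cells
      "pseudocount" - add `pseudocount` to every value, then use all rows
    """
    if mode == "pseudocount":
        counts = [[k + pseudocount for k in row] for row in counts]
        mode = "skip_rows"  # no zeros remain, so no row is dropped

    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]

    for row in counts:
        if mode == "skip_rows" and any(k == 0 for k in row):
            continue
        # Log of the geometric mean over the non-zero cells of this row.
        logs = [math.log(k) for k in row if k > 0]
        if not logs:
            continue  # an all-zero row carries no information
        log_geomean = sum(logs) / len(logs)
        for j, k in enumerate(row):
            if k > 0:
                log_ratios[j].append(math.log(k) - log_geomean)

    # One size factor per sample: the median ratio to the row references.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data (made up): sample 2 is sequenced twice as deeply as sample 1,
# and the third gene has a zero in sample 1.
toy = [[10, 20], [100, 200], [0, 4], [5, 10]]
for m in ("skip_rows", "skip_cells", "pseudocount"):
    print(m, size_factors(toy, mode=m))
```

Note that in the "skip_cells" variant the reference for a row with zeros is a geometric mean over fewer samples than for the other rows, which may be one way the bias mentioned above creeps in.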

That reminds me of the development of 'vsn' many years ago, where 
similar considerations suggested that one should do the estimation of 
the normalisation parameters and of the across-array error model (i.e. 
variance stabilising transformation) simultaneously; whereas currently 
in DESeq, this is done as a two-step process, and somewhat more 
heuristically.

This could be a nice master's thesis for someone...

Best wishes
	Wolfgang

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


