[BioC] re incomplete analysis in Deseq

Kasper Daniel Hansen kasperdanielhansen at gmail.com
Tue Mar 6 22:01:39 CET 2012


On Tue, Mar 6, 2012 at 3:28 PM, Wolfgang Huber <whuber at embl.de> wrote:
>
> Hi Julian, Kasper
>
> in this case you will need to (and, I am sure, will be able to) come up with
> another method for estimating size factors that works on these data. Why not
> try 'colSums'?

Hey, I am just reporting.
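
For reference, the 'colSums' suggestion could look roughly like the
sketch below.  This is only an illustration: the function name
colSumSizeFactors and the toy matrix are made up here, and rescaling to
a geometric mean of 1 is my own choice, not something DESeq prescribes.
It assumes every sample has a total count > 0.

```r
# Library-size ("colSums") size factors; a sketch, not part of DESeq itself.
# colSumSizeFactors is a hypothetical helper name.
colSumSizeFactors <- function(counts) {
  libSizes <- colSums(counts)
  # Rescale so the factors have geometric mean 1 (an assumption made here
  # for convenience; requires all library sizes to be > 0).
  libSizes / exp(mean(log(libSizes)))
}

# Toy data: 3 rows, 2 samples, every row contains a zero.
counts <- matrix(c(0, 5, 0,
                   4, 0, 6),
                 nrow = 3, dimnames = list(NULL, c("s1", "s2")))
sf <- colSumSizeFactors(counts)

# With a CountDataSet `cds`, the values could then be assigned by hand, e.g.
#   sizeFactors(cds) <- colSumSizeFactors(counts(cds))
```

Unlike the default estimate, this does not drop rows with zeros, so it
still gives finite values on data like Julian's.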

It might be worthwhile to add a check in
estimateSizeFactorsForMatrix that at least warns when the number of
rows with a finite log geometric mean (i.e. rows without any zero
counts) is small, say fewer than 100.  This could be an issue for data
types with few features, e.g. small RNAs.  I am sure people are using
DESeq for these types of data as well.
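
Something along these lines is what I have in mind.  It is a sketch
written from the behaviour described in this thread, not the actual
DESeq source; the name sizeFactorsWithCheck, the minRows argument and
the warning text are all made up for illustration.

```r
# Hypothetical median-of-ratios estimate with a sanity check; not DESeq code.
sizeFactorsWithCheck <- function(counts, minRows = 100) {
  # Per-row mean of log counts; -Inf whenever a row contains a zero.
  loggeomeans <- rowMeans(log(counts))
  usable <- is.finite(loggeomeans)
  if (sum(usable) < minRows) {
    warning(sprintf(
      "only %d of %d rows have counts > 0 in all samples; size factors may be unreliable or NA",
      sum(usable), nrow(counts)))
  }
  # Median log ratio to the row geometric mean, over usable rows only.
  apply(counts, 2, function(cnts) {
    exp(median((log(cnts) - loggeomeans)[usable]))
  })
}

# Toy data: every row has at least one zero, so no row is usable.
counts <- matrix(c(0, 5, 0,
                   4, 0, 6), nrow = 3)
sf <- sizeFactorsWithCheck(counts)  # warns; all values are NA
```

With no usable rows the median is taken over an empty vector, which is
NA in R, so every sample ends up with an NA size factor.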

Kasper

>
>        Best wishes
>        Wolfgang
>
> Mar/6/12 3:25 PM, Kasper Daniel Hansen scripsit::
>
>> I got the data from Julian.  It is not a standard RNA-seq experiment.
>> He only has 150 "genes" (rows), but 76 samples (columns).  There are
>> no NAs.  The problem arises from the fact that
>> estimateSizeFactorsForMatrix only uses rows (genes) where all samples
>> have counts > 0.  In Julian's case, every row has at least one column
>> with a zero count, so loggeomeans is -Inf for all rows and no rows are
>> left for computing the size factors.
>>
>> Kasper
>>
>> On Mon, Mar 5, 2012 at 10:59 AM, Wolfgang Huber <whuber at embl.de> wrote:
>>>
>>> Dear Julian
>>>
>>> do your data contain NA?
>>> What is the output of sessionInfo()?
>>>
>>> Can you provide a code example (incl. input data, potentially a subset)
>>> that reproduces your error?
>>>
>>>        Best wishes
>>>        Wolfgang
>>>
>>>
>>> Mar/5/12 11:50 AM, Julian [guest] scripsit::
>>>
>>>>
>>>> I'm using DESeq with 454 data. It worked for one set of data, but the
>>>> same script fails on a second set of experimental data.
>>>>
>>>> The input data is a matrix of counts of 454 seqs per sample; I have 36
>>>> pre and 36 post samples.
>>>>
>>>> When I run estimateSizeFactors, I get NA for all my samples.
>>>>
>>>> Any ideas why?
>>>>
>>>>  -- output of sessionInfo():
>>>>
>>>>
>>>>> cds <- estimateSizeFactors(cds)  # Estimates size factors based on the count data
>>>>> sizeFactors( cds )
>>>>
>>>>
>>>>   X01_MA_1  X02_MA_10 X03_MA_100 X04_MA_102  X05_MA_11  X06_MA_13
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X07_MA_14  X08_MA_15  X09_MA_17  X10_MA_18  X11_MA_19   X12_MA_2
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X13_MA_20  X14_MA_22  X15_MA_23  X16_MA_24  X17_MA_25   X18_MA_4
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X19_MA_47   X20_MA_5  X21_MA_69   X22_MA_7  X23_MA_71  X24_MA_73
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X25_MA_75  X26_MA_77  X27_MA_79   X28_MA_8  X29_MA_81  X30_MA_83
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X31_MA_86  X32_MA_88   X33_MA_9  X34_MA_90  X35_MA_92  X36_MA_94
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X37_MA_96  X38_MA_98 X39_MA_101 X40_MA_103  X41_MA_26  X42_MA_27
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X43_MA_29  X44_MA_30  X45_MA_31  X46_MA_33  X47_MA_34  X48_MA_36
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X49_MA_37  X50_MA_40  X51_MA_41  X52_MA_42  X53_MA_43  X54_MA_44
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X55_MA_45  X56_MA_46  X57_MA_49  X58_MA_50  X59_MA_52  X60_MA_54
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X61_MA_55  X62_MA_70  X63_MA_72  X64_MA_74  X65_MA_76  X66_MA_78
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X67_MA_80  X68_MA_82  X69_MA_84  X70_MA_87  X71_MA_89  X72_MA_91
>>>>         NA         NA         NA         NA         NA         NA
>>>>  X73_MA_93  X74_MA_95  X75_MA_97  X76_MA_99
>>>>         NA         NA         NA         NA
>>>>
>>>>
>>>> --
>>>> Sent via the guest posting facility at bioconductor.org.
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at r-project.org
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>>
>>>
>>>
>>> --
>>> Best wishes
>>>        Wolfgang
>>>
>>> Wolfgang Huber
>>> EMBL
>>> http://www.embl.de/research/units/genome_biology/huber
>>>
>>>


