[BioC] dexseq fitDispersionFunction(ecs) error

Wed Apr 23 05:04:31 CEST 2014

Thank you so much for your constant support! That' s good to know! I already tried that and yes, it did work. I really appreciate your kind help!

Thanks,
Xiayu

-----Original Message-----
From: Alejandro Reyes [mailto:alejandro.reyes at embl.de] 
Sent: Tuesday, April 22, 2014 11:33 AM
To: Rao,Xiayu; 'bioconductor at r-project.org'
Subject: Re: dexseq fitDispersionFunction(ecs) error

Dear Xiayu,

Thanks for your report, we are looking into that right now, it has to do with versions of dependencies outside bioconductor.

As a temporary solution, you can install an old version of statmod compatible with your version of DEXSeq by doing in your R session:

library(devtools);install_url("http://cran.r-project.org/src/contrib/Archive/statmod/statmod_1.4.18.tar.gz")

Your code below should work then!

Best regards,
Alejandro

> Hello, Alejandro
>
> Thank you very much for your previous guidance. I did what you suggested, merging the count data for all samples to generate a matrix and used this following code to get ecs.
>> ecs <- newExonCountSet(countData=countdata, design=sampleTable, 
>> geneIDs=geneid, exonIDs=exonid)
> The estimateDispersions(ecs) step was "Done", but the problem came as the following. Could you please give me some advice on dealing with this error ??
> .............................................
> Done
>> ecs <- fitDispersionFunction(ecs)
> Error in fitDispersionFunction(ecs) :
>    no CR dispersion estimations found, please first call 
> estimateDispersions function
>
> I found a way from the forum to check. It seems that the dispersion estimates are all NAs, which could also be seen from fData(ecs).
>> sum(!is.na(fData(ecs)$dispBeforeSharing)) / nrow(fData(ecs))
> [1] 0
>
>
>> head(countdata)
>                       SRR791052 SRR791054 SRR791056 SRR791057 SRR791058
> ENSG00000000003:0001         0         0         0         0         0
> ENSG00000000003:0002        75       120        67       108        93
> ENSG00000000003:0003         0         0         0         0         0
> ENSG00000000003:0004        78       126        59        58        74
> ENSG00000000003:0005         0         0         0         0         0
> ENSG00000000003:0006        67       114        37        47        54
>                       SRR791059 SRR791061 SRR791062 SRR791066 SRR791067
> ENSG00000000003:0001         0         0         0         0         2
> ENSG00000000003:0002       126        85       108        43        41
> ENSG00000000003:0003         0         1         0         0         0
> ENSG00000000003:0004       131        67        98        37        14
> ENSG00000000003:0005         0         0         0         0         0
> ENSG00000000003:0006        94        56        80        37        26
>                       SRR791068 SRR791069 SRR791070 SRR791071 SRR791072
> ENSG00000000003:0001         0         0         0         0         0
> ENSG00000000003:0002        34        64        55        62        64
> ENSG00000000003:0003         1         0         0         0         0
> ENSG00000000003:0004        22        26        32        54        47
> ENSG00000000003:0005         0         0         0         0         0
> ENSG00000000003:0006        33        36        42        60        40
>                       SRR791073
> ENSG00000000003:0001         0
> ENSG00000000003:0002        50
> ENSG00000000003:0003         0
> ENSG00000000003:0004        18
> ENSG00000000003:0005         1
> ENSG00000000003:0006        29
>
>> head(geneid)
> [1] "ENSG00000000003" "ENSG00000000003" "ENSG00000000003" "ENSG00000000003"
> [5] "ENSG00000000003" "ENSG00000000003"
>
>> head(exonid)
> [1] "0001" "0002" "0003" "0004" "0005" "0006"
>
>> sampleTable
>                           condition
> SRR791052     HER2+
> SRR791054     HER2+
> SRR791056     HER2+
> SRR791057     HER2+
> SRR791058     HER2+
> SRR791059     HER2+
> SRR791061     HER2+
> SRR791062     HER2+
> SRR791066    Benign
> SRR791067    Benign
> SRR791068    Benign
> SRR791069    Benign
> SRR791070    Benign
> SRR791071    Benign
> SRR791072    Benign
> SRR791073    Benign
>
> Note: I am using R 3.0.3 and DEXSeq 1.8.0.
>
> Thanks,
> Xiayu
>
>
>
> -----Original Message-----
> From: Rao,Xiayu
> Sent: Wednesday, April 16, 2014 9:18 AM
> To: 'Alejandro Reyes'; bioconductor at r-project.org
> Subject: RE: dexseq read.HTSeqCounts error
>
> Thank you, Alejandro!  That sounds to be a good way to go. I will do what you suggested.
>
> Thanks,
> Xiayu
>
> -----Original Message-----
> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de]
> Sent: Wednesday, April 16, 2014 2:36 AM
> To: Rao,Xiayu; bioconductor at r-project.org
> Subject: Re: dexseq read.HTSeqCounts error
>
> Hi Xiayu,
>
> Please keep your answers in the mailing list, so it can be useful for users encountering similar issues.
>
> Yes, that is the problem, the function is expecting the same number of lines for each file (it is expecting the users to use the same annotation file or the same features as counting bins for all their samples).  What you could do is read independently your files within R, generate a matrix of the union of all possible counting bins (the first column of your files) times samples (columns), and then fill them in with the information from your files, leaving a 0 on the corresponding missing values. Then use the function newExonCountSet to generate your ecs file.
>
> Best regards,
> Alejandro
>
>
>
>
>
>> Hi, Alejandro
>>
>> Thank you for your prompt response. Below is the output. I think my files are OK, but they have different number of lines. So you think that is the issue?? I am not counting reads that locate in exonic regions, so I do not have a .gtf file. I am counting the reads within some splice sites. Do you have any suggestions for me in my case?
>>
>>> all( file.exists(as.character(sampleTable$countfiles)) )
>> [1] TRUE
>>
>> -sh-4.1$ wc -l *count.txt
>>      222604 SRR791043_count.txt
>>      215908 SRR791044_count.txt
>>      232519 SRR791045_count.txt
>>      227275 SRR791046_count.txt
>>      241367 SRR791047_count.txt
>>      253112 SRR791048_count.txt
>>      234680 SRR791049_count.txt
>>      199632 SRR791050_count.txt
>>      208880 SRR791051_count.txt
>>      195265 SRR791052_count.txt
>>      .........
>>
>> Thanks,
>> Xiayu
>>
>>
>> -----Original Message-----
>> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de]
>> Sent: Tuesday, April 15, 2014 11:44 AM
>> To: Rao,Xiayu; 'bioconductor at r-project.org'
>> Subject: Re: dexseq read.HTSeqCounts error
>>
>> Dear Xiayu Rao,
>>
>> What is the output of:
>>
>> all( file.exists(as.character(sampleTable$countfiles)) )
>>
>> ?
>>
>> Maybe one of your files is corrupted? Do they have all the same number of lines?
>>
>> Best regards,
>> Alejandro
>>
>>
>>> Hi, Alejandro and others
>>>
>>> Thank you for your information on the forum, which is very helpful. I now have a problem in running dexseq. I need your advice to solve the problem. Any input would be very appreciated.
>>>
>>>> ecs <-
>>>> read.HTSeqCounts(countfiles=as.character(sampleTable$countFile),des
>>>> i
>>>> g
>>>> n=sampleTable, flattenedfile=NULL)
>>> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001", "ENSG00000000003:002",  :
>>>      attempt to set 'rownames' on an object with no dimensions
>>>
>>> I then checked:
>>>> dim(lf)
>>> NULL
>>>> dim(dcounts)
>>> NULL
>>>
>>> I generated count files by myself instead of using dexseq_count.py. I tried to follow the standard count file format:
>>> -sh-4.1$ more SRR791043_count.txt
>>> ENSG00000000003:001     284
>>> ENSG00000000003:002     179
>>> ENSG00000000003:003     275
>>> ENSG00000000003:004     156
>>> ENSG00000000003:005     177
>>> ENSG00000000003:006     157
>>> ENSG00000000003:007     9
>>> ENSG00000000003:008     45
>>> ENSG00000000003:009     4
>>> ENSG00000000003+ENSG00000102362:001     3
>>> ............
>>> Intergene1:001       1
>>> .........
>>> intergene2097:001       11
>>> intergene2098:001       99
>>> intergene2099:001       54
>>> intergene2100:001       3
>>> intergene2101:001       3
>>> intergene2102:001       20
>>> intergene2103:001       1
>>> intergene2104:001       2
>>> intergene2105:001       1
>>> intergene2106:001       10
>>> _ambiguous      0
>>> _empty  0
>>> _lowaqual       0
>>> _notaligned     0
>>>
>>>
>>>> sampleTable
>>>                                                  countFile condition
>>> SRR791052 SRR791052_count.txt     HER2+
>>> SRR791054 SRR791054_count.txt     HER2+
>>> SRR791056 SRR791056_count.txt     HER2+
>>> SRR791057 SRR791057_count.txt     HER2+
>>> SRR791058 SRR791058_count.txt     HER2+
>>> SRR791059 SRR791059_count.txt     HER2+
>>> SRR791061 SRR791061_count.txt     HER2+
>>> SRR791062 SRR791062_count.txt     HER2+
>>> SRR791066 SRR791066_count.txt    Benign
>>> SRR791067 SRR791067_count.txt    Benign
>>> SRR791068 SRR791068_count.txt    Benign
>>> SRR791069 SRR791069_count.txt    Benign
>>> SRR791070 SRR791070_count.txt    Benign
>>> SRR791071 SRR791071_count.txt    Benign
>>> SRR791072 SRR791072_count.txt    Benign
>>> SRR791073 SRR791073_count.txt    Benign
>>>
>>> Note: I am using R 3.0.3 and DEXSeq 1.8.0
>>>
>>>
>>> Thanks,
>>> Xiayu
>>>