[BioC] dexseq read.HTSeqCounts error

Rao,Xiayu XRao at mdanderson.org
Wed Apr 16 16:18:25 CEST 2014


Thank you, Alejandro! That sounds like a good way to go. I will do what you suggested.

Thanks,
Xiayu

-----Original Message-----
From: Alejandro Reyes [mailto:alejandro.reyes at embl.de] 
Sent: Wednesday, April 16, 2014 2:36 AM
To: Rao,Xiayu; bioconductor at r-project.org
Subject: Re: dexseq read.HTSeqCounts error

Hi Xiayu,

Please keep your replies on the mailing list so that they can be useful to other users who run into similar issues.

Yes, that is the problem: the function expects every file to have the same number of lines (it assumes that users use the same annotation file, or the same set of features as counting bins, for all their samples). What you could do is read your files into R independently, generate a matrix whose rows are the union of all counting bins (the first column of your files) and whose columns are the samples, and then fill it in with the counts from your files, leaving a 0 wherever a bin is missing from a sample. Then use the function newExonCountSet to generate your ecs object.
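
The steps above could be sketched roughly as follows. This is only a minimal illustration, assuming two-column, whitespace-separated count files like the ones shown further down in this thread; the helper names read_union_counts and split_bin_ids are hypothetical and not part of DEXSeq.

```r
# Hypothetical helper (not part of DEXSeq): read two-column count files
# (bin ID, count) and combine them into one matrix over the union of all
# bin IDs, with 0 wherever a bin is absent from a sample.
read_union_counts <- function(files, sample_names = basename(files)) {
  per_sample <- lapply(files, function(f) {
    tab <- read.table(f, header = FALSE, sep = "", quote = "",
                      comment.char = "", stringsAsFactors = FALSE,
                      col.names = c("id", "count"))
    # Drop summary rows such as "_ambiguous" or "_empty".
    tab <- tab[!grepl("^_", tab$id), ]
    setNames(tab$count, tab$id)
  })

  # Rows: union of all bin IDs seen in any file. Columns: samples.
  all_ids <- sort(unique(unlist(lapply(per_sample, names))))
  counts <- matrix(0L, nrow = length(all_ids), ncol = length(files),
                   dimnames = list(all_ids, sample_names))

  # Fill in the observed counts; bins missing from a sample stay at 0.
  for (j in seq_along(per_sample)) {
    counts[names(per_sample[[j]]), j] <- per_sample[[j]]
  }
  counts
}

# Hypothetical helper: split "gene:bin" IDs at the last colon.
split_bin_ids <- function(ids) {
  data.frame(geneID = sub(":[^:]*$", "", ids),
             exonID = sub("^.*:", "", ids),
             stringsAsFactors = FALSE)
}

# Possible usage with the sampleTable from this thread (not run here):
# counts <- read_union_counts(as.character(sampleTable$countFile),
#                             rownames(sampleTable))
# ids <- split_bin_ids(rownames(counts))
```

The resulting matrix, the gene and bin IDs, and your design can then be passed to newExonCountSet; see ?newExonCountSet for the exact arguments expected by your DEXSeq version.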

Best regards,
Alejandro





> Hi, Alejandro
>
> Thank you for your prompt response. Below is the output. I think my files are OK, but they have different numbers of lines. Do you think that is the issue? I am not counting reads located in exonic regions, so I do not have a .gtf file. I am counting the reads within some splice sites. Do you have any suggestions for my case?
>
>> all( file.exists(as.character(sampleTable$countfiles)) )
> [1] TRUE
>
> -sh-4.1$ wc -l *count.txt
>     222604 SRR791043_count.txt
>     215908 SRR791044_count.txt
>     232519 SRR791045_count.txt
>     227275 SRR791046_count.txt
>     241367 SRR791047_count.txt
>     253112 SRR791048_count.txt
>     234680 SRR791049_count.txt
>     199632 SRR791050_count.txt
>     208880 SRR791051_count.txt
>     195265 SRR791052_count.txt
>     .........
>
> Thanks,
> Xiayu
>
>
> -----Original Message-----
> From: Alejandro Reyes [mailto:alejandro.reyes at embl.de]
> Sent: Tuesday, April 15, 2014 11:44 AM
> To: Rao,Xiayu; 'bioconductor at r-project.org'
> Subject: Re: dexseq read.HTSeqCounts error
>
> Dear Xiayu Rao,
>
> What is the output of:
>
> all( file.exists(as.character(sampleTable$countfiles)) )
>
> ?
>
> Maybe one of your files is corrupted? Do they all have the same number of lines?
>
> Best regards,
> Alejandro
>
>
>> Hi, Alejandro and others
>>
>> Thank you for the information you have posted on the forum, which is very helpful. I now have a problem running DEXSeq and need your advice to solve it. Any input would be much appreciated.
>>
>>> ecs <- read.HTSeqCounts(countfiles=as.character(sampleTable$countFile), design=sampleTable, flattenedfile=NULL)
>> Error in `rownames<-`(`*tmp*`, value = c("ENSG00000000003:001", "ENSG00000000003:002",  :
>>     attempt to set 'rownames' on an object with no dimensions
>>
>> I then checked:
>>> dim(lf)
>> NULL
>>> dim(dcounts)
>> NULL
>>
>> I generated the count files myself instead of using dexseq_count.py. I tried to follow the standard count file format:
>> -sh-4.1$ more SRR791043_count.txt
>> ENSG00000000003:001     284
>> ENSG00000000003:002     179
>> ENSG00000000003:003     275
>> ENSG00000000003:004     156
>> ENSG00000000003:005     177
>> ENSG00000000003:006     157
>> ENSG00000000003:007     9
>> ENSG00000000003:008     45
>> ENSG00000000003:009     4
>> ENSG00000000003+ENSG00000102362:001     3
>> ............
>> Intergene1:001       1
>> .........
>> intergene2097:001       11
>> intergene2098:001       99
>> intergene2099:001       54
>> intergene2100:001       3
>> intergene2101:001       3
>> intergene2102:001       20
>> intergene2103:001       1
>> intergene2104:001       2
>> intergene2105:001       1
>> intergene2106:001       10
>> _ambiguous      0
>> _empty  0
>> _lowaqual       0
>> _notaligned     0
>>
>>
>>> sampleTable
>>                     countFile condition
>> SRR791052 SRR791052_count.txt     HER2+
>> SRR791054 SRR791054_count.txt     HER2+
>> SRR791056 SRR791056_count.txt     HER2+
>> SRR791057 SRR791057_count.txt     HER2+
>> SRR791058 SRR791058_count.txt     HER2+
>> SRR791059 SRR791059_count.txt     HER2+
>> SRR791061 SRR791061_count.txt     HER2+
>> SRR791062 SRR791062_count.txt     HER2+
>> SRR791066 SRR791066_count.txt    Benign
>> SRR791067 SRR791067_count.txt    Benign
>> SRR791068 SRR791068_count.txt    Benign
>> SRR791069 SRR791069_count.txt    Benign
>> SRR791070 SRR791070_count.txt    Benign
>> SRR791071 SRR791071_count.txt    Benign
>> SRR791072 SRR791072_count.txt    Benign
>> SRR791073 SRR791073_count.txt    Benign
>>
>> Note: I am using R 3.0.3 and DEXSeq 1.8.0
>>
>>
>> Thanks,
>> Xiayu
>>


