[Bioc-sig-seq] 'coverage' error message

Patrick Aboyoun paboyoun at fhcrc.org
Mon Jan 4 22:47:17 CET 2010


P.,
As the error message suggests, there is a mismatch between 
names(arab.chromlens) and levels(chromosome(alns)), meaning the 
chromosome lengths vector and the AlignedRead object are not in sync. 
The aligned reads for this experiment were from a mouse model, not 
Arabidopsis thaliana, so you would need to reference 
BSgenome.Mmusculus.UCSC.mm9 when performing these operations:


 > filt1 <- alignDataFilter(expression(filtering=="Y"))
 > filt2 <- chromosomeFilter("chr[0-9XYM]+.fa")
 > filt <- compose(filt1, filt2)

 > alns <- readAligned(extdataDir, pattern, type="SolexaExport", filter=filt)
 > alns
class: AlignedRead
length: 195719 reads; width: 35 cycles
chromosome: chr11.fa chr9.fa ... chr8.fa chr4.fa
position: 104853312 3036336 ... 44295163 47191474
strand: - - ... - -
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig

 > levels(alns@chromosome) <- sub(".fa$", "", levels(chromosome(alns)))
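
One caveat on the suffix-stripping step: the pattern ".fa$" leaves names ending in ".fas" (as in the quoted alns_8 output) untouched, and its unescaped dot matches any character. A minimal base-R sketch of a slightly more defensive pattern follows; strip_suffix is a hypothetical helper name, and the input vectors simply copy chromosome names shown in this thread.

```r
# Chromosome names as printed in the two sessions in this thread.
fa_levels  <- c("chr11.fa", "chr9.fa", "chr8.fa", "chr4.fa")
fas_levels <- c("chr1.fas")

# The original pattern does not match a ".fas" suffix, so the name is unchanged.
unchanged <- sub(".fa$", "", fas_levels)

# Hypothetical helper: escape the dot and allow an optional trailing "s".
strip_suffix <- function(x) sub("\\.fas?$", "", x)

clean_fa  <- strip_suffix(fa_levels)
clean_fas <- strip_suffix(fas_levels)
print(unchanged)
print(clean_fa)
print(clean_fas)
```

With an AlignedRead object the same function would be applied to levels(chromosome(alns)) before assigning the result back.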

 > library(BSgenome.Mmusculus.UCSC.mm9)
 > mm9.chromlens <- seqlengths(Mmusculus)
 > head(mm9.chromlens)
     chr1      chr2      chr3      chr4      chr5      chr6
197195432 181748087 159599783 155630120 152537259 149517037

 > cov.mm9 <- coverage(alns, width = mm9.chromlens, extend = 126L)
 > cov.mm9
SimpleRleList of length 22
$chr1
'integer' Rle of length 197195432 with 27263 runs
  Lengths:  3018534 161 16703 161 68815 161 33063 161 58217 161 ...
  Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr10
'integer' Rle of length 129993255 with 21699 runs
  Lengths:  3019736 161 11311 161 4238 161 10661 161 793 161 ...
  Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr11
'integer' Rle of length 121843856 with 22105 runs
  Lengths:  3000315 6 40 79 9 4 23 6 2 38 ...
  Values :  0 1 2 3 4 5 6 5 4 5 ...

$chr12
'integer' Rle of length 121257530 with 18183 runs
  Lengths:  3002552 161 6903 161 4375 161 5041 161 2491 161 ...
  Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr13
'integer' Rle of length 120284312 with 15907 runs
  Lengths:  3001262 161 5650 161 29080 161 111 40 121 40 ...
  Values :  0 1 0 1 0 1 0 1 2 1 ...

...
<17 more elements>
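
The run lengths of 161 in the output above are consistent with 35-cycle reads extended by 126 bases (35 + 126 = 161). The base-R sketch below illustrates that arithmetic on a toy example. It assumes the extension is applied in the direction of each read's strand, and it is not meant to describe how ShortRead computes coverage internally. The positions, strands, and chromosome length of 1000 are made up for illustration.

```r
width  <- 35L    # read width in cycles, as reported for alns
extend <- 126L   # value passed to coverage() in this thread

# Hypothetical reads: leftmost aligned position and strand.
pos    <- c(100L, 150L, 400L)
strand <- c("+", "-", "+")

# Assumption: "+" reads extend to the right, "-" reads extend to the left.
start <- ifelse(strand == "+", pos, pos - extend)
end   <- ifelse(strand == "+", pos + width + extend - 1L, pos + width - 1L)

# Per-base coverage via a difference array and a cumulative sum.
chrom_len <- 1000L
delta <- integer(chrom_len + 1L)
for (i in seq_along(start)) {
  delta[start[i]]    <- delta[start[i]] + 1L
  delta[end[i] + 1L] <- delta[end[i] + 1L] - 1L
}
cov  <- cumsum(delta)[seq_len(chrom_len)]
runs <- rle(cov)   # run-length encoding, analogous in spirit to an Rle

print(end - start + 1L)   # every extended read spans 161 bases
print(runs$lengths)
print(runs$values)
```

An isolated read shows up as a single run of 161 ones, while overlapping reads produce shorter runs with values above 1, similar to the chr11 output above.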
 > sessionInfo()
R version 2.11.0 Under development (unstable) (2010-01-02 r50884)
i386-apple-darwin9.8.0

locale:
[1] C/C/C/C/C/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets
[6] methods   base    

other attached packages:
[1] BSgenome.Mmusculus.UCSC.mm9_1.3.16    
[2] BSgenome.Athaliana.TAIR.04232008_1.3.16
[3] ShortReadTutorial_0.0.1               
[4] ShortRead_1.5.10                      
[5] lattice_0.17-26                       
[6] BSgenome_1.15.3                       
[7] Biostrings_2.15.11                    
[8] IRanges_1.5.23                        

loaded via a namespace (and not attached):
[1] Biobase_2.7.3 grid_2.11.0   hwriter_1.1   tools_2.11.0
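
To catch this kind of mismatch before calling coverage(), the two sets of names can be compared directly. The following is a minimal base-R sketch, not part of ShortRead; unmatched_chroms is a hypothetical helper, and the toy vectors reuse three chromosome names and lengths that appear in the session above.

```r
# Hypothetical helper: chromosome names used by the reads that have no
# entry in the named vector of chromosome lengths.
unmatched_chroms <- function(read_levels, chromlens) {
  setdiff(read_levels, names(chromlens))
}

# Toy stand-ins for levels(chromosome(alns)) and seqlengths(Mmusculus).
read_levels <- c("chr11.fa", "chr4.fa", "chr1.fa")
chromlens   <- c(chr1 = 197195432L, chr4 = 155630120L, chr11 = 121843856L)

# Before stripping the ".fa" suffix, none of the names match.
before <- unmatched_chroms(read_levels, chromlens)

# After stripping it, every name has a matching length.
after <- unmatched_chroms(sub("\\.fa$", "", read_levels), chromlens)

print(before)
print(after)
```

An empty result means every chromosome level has a matching length; anything else lists the names that would trigger the error.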


Cheers,
Patrick



pterry at huskers.unl.edu wrote:
> Dear bioc-sig-sequencing,
>
> I am trying to analyze Eland aligned files for differential expression, using the 'A ChIP-Seq Data Analysis' handout from an 11/19/09 session at the 'High throughput sequence analysis tools and approaches with Bioconductor' workshop in Seattle.
>
> I generated an error message in the following output.  Can you comment?
>
> ...
>
>   
>> alns_8 <- readAligned(cdataDir, pattern, "SolexaExport")
>> alns_8
>>     
> class: AlignedRead
> length: 1380439 reads; width: 35 cycles
> chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas
> position: 7568294 167488 ... 4687256 5376960
> strand: + + ... + +
> alignQuality: NumericQuality
> alignData varLabels: run lane ... filtering contig
>   
>> head(sread(alns_8))
>>     
>   A DNAStringSet instance of length 6
>     width seq
> [1]    35 AGCTATGATCAAGAGAACCTTTCACGATCANNNCN
> [2]    35 CGGACGACGGGTAGTTTCGGGCTGTACCAANNNAN
> [3]    35 AGCTCAGCGATCTGAGCCACTTGCTCTTTGNNNTN
> [4]    35 GGGCCATAGGCCCGTTAAAATATTTTTCTCTNNCT
> [5]    35 ATTGTCCATTGACAAATGAAGATATTGGGATNNTT
> [6]    35 ACCCCTCCACCAGTATGTTGGCGAAAATCTCNNCC
>   
>> table(strand(alns_8), useNA="ifany")
>>     
>
>      -      +      *
> 689912 690527      0
>
> ...
>
>   
>> library(BSgenome.Athaliana.TAIR.04232008)
>> arab.chromlens <- seqlengths(Athaliana)
>> head(arab.chromlens)
>>     
>     chr1     chr2     chr3     chr4     chr5     chrC
> 30432563 19705359 23470805 18585042 26992728   154478
>   
>> cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L)
>>     
> Error: UserArgumentMismatch
>   'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))'
>   see ?"AlignedRead-class"
>
>   
>> sessionInfo()
>>     
> R version 2.10.1 (2009-12-14)
> x86_64-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] BSgenome.Athaliana.TAIR.04232008_1.3.16
> [2] chipseq_0.2.0
> [3] ShortRead_1.4.0
> [4] lattice_0.17-26
> [5] BSgenome_1.14.0
> [6] Biostrings_2.14.1
> [7] IRanges_1.4.2
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.6.0 grid_2.10.1   hwriter_1.1
>   
>
>
> Thanks,
> P. Terry
> pterry at huskers.unl.edu
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>


