[BioC] BStringSet does not work with lists of many DNAString elements

heyi xiao xiaoheyiyh at yahoo.com
Fri Oct 18 19:46:46 CEST 2013


Dear all,
I have been using the Biostrings/BSgenome utilities to extract DNA sequences for Entrez genes. Everything worked fine until I tried to write the extracted sequences to a FASTA file. As far as I can tell, writeXStringSet is the only function for writing FASTA files, and it only works with an XStringSet object, so I need to convert my list of DNAString objects into an XStringSet object. Unfortunately, the BStringSet constructor only works on lists with a few DNAString elements; on larger lists it fails with the error shown below. I am not sure how to deal with this issue. Thanks in advance for any suggestions!
Heyi

>  exonSeq.set=BStringSet(exonSeq.list[1:30])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start,  : 
  subscript out of bounds
>  exonSeq.set=BStringSet(exonSeq.list[1:25])
>  exonSeq.set=BStringSet(exonSeq.list[1:26])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start,  : 
  subscript out of bounds
>  exonSeq.set=BStringSet(exonSeq.list[26:30])
>  exonSeq.set=BStringSet(exonSeq.list[26:40])
Error in .Call2("SharedVector_mcopy", dest, dest.offset, src, src.start,  : 
  subscript out of bounds

> head(exonSeq.list,3)
$`442993`
  133057-letter "DNAString" instance
seq: TGAGACGGCTTTTATTCCTGAGCTTCTGCTGCTCAC...AAAGCTGTCATCAATGAAAAAAGGTAAGAGAAAAAC

$`442994`
  23917-letter "DNAString" instance
seq: CAGTTCTGACCCACTTCAAGGTTACATCTCCAAGGT...CTTACGATTTTTGCAGATAAAAAATTTATCTGCAAA

$`442995`
  21718-letter "DNAString" instance
seq: GTCTTCTCTCCTTGCTGCTCTCAGGTAGGGGCTGGG...GGAAGAAGCAGAATAAAGCAATTTTCCTTGAAGTGA

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] BSgenome.Oaries.NCBI.Oar3.1_1.0 Biobase_2.21.6                 
[3] BSgenome_1.29.0                 Biostrings_2.29.14             
[5] GenomicRanges_1.13.35           XVector_0.1.0                  
[7] IRanges_1.19.19                 BiocGenerics_0.7.3             

loaded via a namespace (and not attached):
[1] stats4_3.0.1 tools_3.0.1
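
For reference, the conversion and output step described above can be sketched as follows. This is only a minimal illustration, not a confirmed fix for the error: it assumes every list element is a DNAString, it uses the DNAStringSet constructor rather than BStringSet, and the short sequences and the temporary output path are made-up stand-ins for the real exonSeq.list.

```r
library(Biostrings)

# Hypothetical stand-in for exonSeq.list: a named list of DNAString objects
# (the real list holds much longer sequences).
exonSeq.list <- list(
    `442993` = DNAString("TGAGACGGCTTTTATTCC"),
    `442994` = DNAString("CAGTTCTGACCCACTTCA"),
    `442995` = DNAString("GTCTTCTCTCCTTGCTGC")
)

# Build a DNAStringSet instead of a BStringSet, so the result keeps
# the DNA alphabet of its elements.
exonSeq.set <- DNAStringSet(exonSeq.list)

# Set the names explicitly; they become the FASTA record headers.
names(exonSeq.set) <- names(exonSeq.list)

# Write the set to a FASTA file (a temporary file here, for illustration).
out.file <- tempfile(fileext = ".fa")
writeXStringSet(exonSeq.set, out.file)
```

Whether this avoids the "subscript out of bounds" error on the full list is not verified here; it only shows the intended list -> XStringSet -> FASTA path.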


