[Bioc-sig-seq] size of DNAString object

Hans-Ulrich Klein h.klein at uni-muenster.de
Thu May 27 16:21:52 CEST 2010


Hi all,

I observed that some DNAString (and also DNAStringSet) objects are too
large after subsetting:

 > library("Rsamtools")
 > parameters = ScanBamParam()
 > bam = scanBam("data/N01.bam", param=parameters)
 > ss = bam[[1]]$seq
 > ss
   A DNAStringSet instance of length 230980
   [...]
 > print(object.size(ss), units="Mb")
83.3 Mb
 > dnaS = ss[[5]]
 > dnaS
   128-letter "DNAString" instance
seq: TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
 > print(object.size(dnaS), units="Mb")
80.7 Mb
 > print(object.size(as.character(dnaS)), units="Kb")
0.2 Kb

When I write the 128-letter DNAString to disk, the resulting file is still
quite large (~20 Mb).
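
The call used to write the object to disk is not shown above. As a rough
sketch only, assuming save() is used, the on-disk size can be measured with
a small helper like the one below. The name disk_size_kb is hypothetical and
not part of Biostrings or Rsamtools; it only uses base R.

```r
# Hypothetical helper (an assumption, not taken from the session above):
# serialize an object with save() into a temporary file and return the
# size of that file in Kb.
disk_size_kb <- function(obj) {
    path <- tempfile(fileext = ".rda")
    on.exit(unlink(path))          # remove the temporary file afterwards
    save(obj, file = path)         # same serialization as a manual save()
    file.info(path)$size / 1024    # file size in bytes converted to Kb
}

# Usage with the objects from the session above (requires that session):
# disk_size_kb(dnaS)                 # the subsetted DNAString
# disk_size_kb(as.character(dnaS))   # the same 128 letters as plain text
```

Comparing the two values side by side shows whether the saved DNAString
carries more data than its 128 letters.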

Best wishes,
Hans-Ulrich




 > sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-pc-linux-gnu

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
[4] IRanges_1.6.4

loaded via a namespace (and not attached):
[1] Biobase_2.8.0


-- 
Hans-Ulrich Klein
Department of Medical Informatics and Biomathematics
University of Münster
Domagkstrasse 9
48149 Münster, Germany
Tel.: +49 (0)251 83-58405


