[Bioc-sig-seq] size of DNAString object

Hans-Ulrich Klein h.klein at uni-muenster.de
Thu May 27 17:36:30 CEST 2010


Thank you! That helped me a lot.

A further question: Is there any way to access the complete DNAStringSet 
"ss" through "dnaS" after I have removed "ss" using the rm() function? If 
not, keeping the complete DNAStringSet in memory does not make much sense 
to me.

Thank you,
Hans-Ulrich
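
For reference, the compact() suggestion from the quoted reply could be tried 
roughly as follows. This is only a minimal sketch: the random reads are a 
hypothetical stand-in for the scanBam() result, and it assumes a Biostrings 
version in which compact() is available after library(Biostrings). The 
reported sizes will depend on the installed versions, so none are shown.

```r
library(Biostrings)

set.seed(1)
# Hypothetical stand-in for bam[[1]]$seq: 10,000 random 128-letter reads.
reads <- replicate(
    10000,
    paste(sample(c("A", "C", "G", "T"), 128, replace = TRUE), collapse = "")
)
ss <- DNAStringSet(reads)

# Extracting one element gives a DNAString that still refers to the
# buffer shared with 'ss' (a view, not a copy).
dnaS <- ss[[5]]
print(object.size(dnaS), units = "Kb")

# compact() is expected to return an equivalent object that keeps only
# the data it actually uses.
small <- compact(dnaS)
print(object.size(small), units = "Kb")

# The sequence itself is unchanged by compacting.
stopifnot(identical(as.character(small), as.character(dnaS)))

# Removing 'ss' drops the name only; 'dnaS' remains usable.
rm(ss)
```

If only the compacted object is kept (and written to disk), the full set of 
reads should no longer need to travel with it.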



Vincent Carey wrote:
> see the information on the compact() method on the XStringSet-class help
> page of package Biostrings
>
> to rationalize this you need to think about the difference between a 
> view and a concrete instance.  typically you do not want a copy to be 
> made on each view
>
> On Thu, May 27, 2010 at 10:21 AM, Hans-Ulrich Klein 
> <h.klein at uni-muenster.de> wrote:
>
>     Hi all,
>
>     I observed that some DNAString (and also DNAStringSet) objects
>     are too large after subsetting:
>
>     > library("Rsamtools")
>     > parameters = ScanBamParam()
>     > bam = scanBam("data/N01.bam", param=parameters)
>     > ss = bam[[1]]$seq
>     > ss
>      A DNAStringSet instance of length 230980
>      [...]
>     > print(object.size(ss), units="Mb")
>     83.3 Mb
>     > dnaS = ss[[5]]
>     > dnaS
>      128-letter "DNAString" instance
>     seq:
>     TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
>     > print(object.size(dnaS), units="Mb")
>     80.7 Mb
>     > print(object.size(as.character(dnaS)), units="Kb")
>     0.2 Kb
>
>     When I write the 128-letter DNAString to disk, it remains quite
>     large (~ 20Mb).
>
>     Best wishes,
>     Hans-Ulrich
>
>
>
>
>     > sessionInfo()
>     R version 2.11.0 (2010-04-22)
>     x86_64-pc-linux-gnu
>
>     locale:
>     [1] C
>
>     attached base packages:
>     [1] stats     graphics  grDevices utils     datasets  methods   base
>
>     other attached packages:
>     [1] Rsamtools_1.0.1     Biostrings_2.16.2   GenomicRanges_1.0.1
>     [4] IRanges_1.6.4
>
>     loaded via a namespace (and not attached):
>     [1] Biobase_2.8.0
>
>
>     -- 
>     Hans-Ulrich Klein
>     Department of Medical Informatics and Biomathematics
>     University of Münster
>     Domagkstrasse 9
>     48149 Münster, Germany
>     Tel.: +49 (0)251 83-58405
>
>
>


-- 
Hans-Ulrich Klein
Department of Medical Informatics and Biomathematics
University of Münster
Domagkstrasse 9
48149 Münster, Germany
Tel.: +49 (0)251 83-58405


