[Bioc-devel] GenomicRanges::assays
Martin Morgan
mtmorgan at fhcrc.org
Wed Jul 10 00:34:09 CEST 2013
The problem is that the dimnames are stored in only one location, and this is not on the assays. When you ask for the assays, the dimnames are added, triggering a full copy of the data. If the dimnames are not of interest, you can avoid the copy with
assays(BS, withDimnames=FALSE)
This is not really ideal, so I'll give some thought to a better implementation.
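
To illustrate why adding the dimnames is expensive, here is a minimal base R sketch. It is not the GenomicRanges implementation: the matrix `m`, the name vectors `rn` and `cn`, and the sizes are made-up stand-ins for one assay and its separately stored dimnames.

```r
# Toy stand-in for a single assay: a numeric matrix stored without dimnames.
# (Sizes are hypothetical; the object in this thread has 28M rows.)
m <- matrix(0, nrow = 1e5, ncol = 7)

# Dimnames kept in one separate place, as described above (hypothetical names).
rn <- paste0("r", seq_len(nrow(m)))
cn <- paste0("s", seq_len(ncol(m)))

# Handing the matrix back untouched does not require duplicating the data.
no_names <- m

# Attaching dimnames modifies the object, so R duplicates the matrix first.
with_names <- m
dimnames(with_names) <- list(rn, cn)

# The stored matrix is unchanged, and the values are the same either way.
stopifnot(is.null(dimnames(m)))
stopifnot(identical(unname(with_names), m))
```

By the same reasoning, names(assays(Object, withDimnames=FALSE)) should return the assay names without paying for the copy.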
Martin
----- Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
> Note the final "s" in assays. It is super slow. This is a BSseq object
> with 28M rows and 7 columns, which means there are two assays M and Cov
> each being 28M x 7 (which is pretty big, on the Gb scale)
>
> These two commands retrieve the same data as far as I understand.
>
> > system.time({BS@assays$field("data")})
>    user  system elapsed
>       0       0       0
> > system.time({assays(BS)})
>    user  system elapsed
>  19.677  10.436  30.114
>
> Follow-up questions:
>
> 1) It seems that all assays are stored in a SimpleList inside a reference
> class. If I only want to replace one of the assays, like
> assay(Object, "NAME") <- value
> does this mean that all assays are being copied? Is this different from
> say eSet where each assay is a matrix in an environment?
>
> 2) I think we need a convenience function for the assay names of a
> SummarizedExperiment. (This is how I saw the issue above, I was using
> names(assays(Object)))
>
> Kasper