[Bioc-devel] GenomicRanges::assays

Martin Morgan mtmorgan at fhcrc.org
Wed Jul 10 00:34:09 CEST 2013


The problem is that the dimnames are stored in only one location, and not on the assays themselves. When you ask for the assays, the dimnames are added to each one, triggering a full copy of the data. If the dimnames are not of interest, you can avoid the copy with

  assays(BS, withDimnames=FALSE)

This is not really ideal, so I'll give some thought to a better implementation.
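
To illustrate why attaching dimnames on access is expensive, here is a minimal base-R sketch. It is a toy stand-in and not the actual GenomicRanges implementation: the names `store`, `row_ids`, `col_ids` and `get_assays` are hypothetical, and the matrices are tiny so the example runs instantly.

```r
# Toy container: assays are stored bare, dimnames are kept once, elsewhere.
row_ids <- paste0("r", 1:3)   # hypothetical row names
col_ids <- paste0("s", 1:2)   # hypothetical column (sample) names
store <- list(M   = matrix(1:6, nrow = 3, ncol = 2),
              Cov = matrix(6:1, nrow = 3, ncol = 2))

# Hypothetical accessor mimicking the behaviour described above.
get_assays <- function(store, row_ids, col_ids, withDimnames = TRUE) {
  if (!withDimnames) {
    return(store)             # nothing is modified, so nothing is duplicated
  }
  # Setting dimnames modifies each matrix, so R's copy-on-modify
  # semantics duplicate every assay before returning it.
  lapply(store, function(a) {
    dimnames(a) <- list(row_ids, col_ids)
    a
  })
}

named <- get_assays(store, row_ids, col_ids)
bare  <- get_assays(store, row_ids, col_ids, withDimnames = FALSE)

rownames(named$M)     # the shared row names, attached on access
is.null(dimnames(bare$M))
names(bare)           # assay names are available without touching the data
```

With 28M x 7 matrices the duplication in the `withDimnames = TRUE` branch is what dominates the timing; the bare branch only hands back the existing list.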

Martin
----- Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
> Note the final "s" in assays.  It is super slow.  This is a BSseq object
> with 28M rows and 7 columns, which means there are two assays M and Cov
> each being 28M x 7 (which is pretty big, on the GB scale).
> 
> These two commands retrieve the same data as far as I understand.
> 
> > system.time({BS@assays$field("data")})
>    user  system elapsed
>       0       0       0
> > system.time({assays(BS)})
>    user  system elapsed
>  19.677  10.436  30.114
> 
> Follow-up questions:
> 
> 1) It seems that all assays are stored in a SimpleList inside a reference
> class.  If I only want to replace one of the assays, like
>   assay(Object, "NAME") <- value
> does this mean that all assays are being copied?  Is this different from
> say eSet where each assay is a matrix in an environment?
> 
> 2) I think we need a convenience function for the assay names of a
> SummarizedExperiment.  (This is how I saw the issue above, I was using
> names(assays(Object)))
> 
> Kasper
> 