[Bioc-devel] SummarizedExperiment

Martin Morgan mtmorgan at fhcrc.org
Mon Mar 24 16:07:19 CET 2014


On 03/20/2014 06:29 PM, Kasper Daniel Hansen wrote:
> It used to be the case that when a SummarizedExperiment was constructed,
> dim names were removed from the matrices in the assays.  One could then either use
> (1)  assay(, withDimnames = TRUE)
> which ensured dim names in the return value, but implied copying of the
> return object because the dim names had to get added, or
> (2) assay(, withDimnames = FALSE)
> which ensured that the return object had no dim names (because they were
> stripped).
>
> It seems that as of a recent commit (based on the log messages, I am guessing
> the two copied in at the bottom of the email), dim names are no longer
> stripped at construction.  This implies that
>    assay(, withDimnames = FALSE)
> returns an object with the dimnames because they are already present in the
> raw object.
>
> Now, my questions are
> (1) can I depend on this behavior?

yes.

> (2) Is there any check that the dimnames which may be present in the 'raw'
> assay object are in line with what I get from assay(withDimnames = TRUE) or
> could I imagine getting different dimnames (and not just no dimnames vs
> with dimnames) depending on withDimnames?
>

more structure will be imposed; the dimnames of the overall object will agree 
with the dimnames of the assays.
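
As a rough illustration of what "the dimnames of the overall object agree with
the dimnames of the assays" could mean as a check, here is a minimal base R
sketch. The helper name check_assay_dimnames() and its arguments are
hypothetical, and accepting an assay without dimnames is an assumption; this is
not the actual validity method.

```r
# Hypothetical helper, not part of GenomicRanges / SummarizedExperiment.
# Returns TRUE when the assay's dimnames agree with the object's names,
# otherwise a character string describing the problem.
check_assay_dimnames <- function(assay, row_names, col_names) {
    dn <- dimnames(assay)
    if (is.null(dn))
        return(TRUE)  # assumption: an assay without dimnames is accepted
    # Only the first two dimensions are compared (assays may have dim() >= 2).
    ok_rows <- is.null(dn[[1]]) || identical(dn[[1]], row_names)
    ok_cols <- is.null(dn[[2]]) || identical(dn[[2]], col_names)
    if (ok_rows && ok_cols)
        TRUE
    else
        "assay dimnames differ from the dimnames of the object"
}

m <- matrix(0, nrow = 2, ncol = 2,
            dimnames = list(c("g1", "g2"), c("s1", "s2")))
agree    <- check_assay_dimnames(m, c("g1", "g2"), c("s1", "s2"))
disagree <- check_assay_dimnames(m, c("g1", "g2"), c("s1", "other"))
```

Under such a check, an object whose assay carries colnames that differ from the
object's own would be rejected at construction rather than returning different
names depending on withDimnames.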

> To get some context, in bsseq I always use withDimnames=FALSE because the
> assay matrices are big (28M rows), so I want to avoid copying.  But now I
> get a failed test, since I construct an object with colnames in the assay.
> This seems to be an esoteric point, but it has performance implications in
> my usage.  I don't know what the right design is - I like that renaming
> things is quick, because it only happens in the colData slot.

I think stripping the dimnames from assays was a mistake. It saves space (but
not much compared to the assay data) and causes a performance bottleneck in
normal use (when the dimnames have to be copied back onto the assay data). So I
think it makes sense to just duplicate / check dimnames. This is the direction
I'll go in, unless there are other opinions.
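
The trade-off between the two designs can be sketched in base R with toy list
containers. The names stripped, duplicated_store, get_stripped() and
get_duplicated() are made up for illustration; this is not how
SummarizedExperiment is implemented.

```r
# Toy containers only; hypothetical names, not the real class.
rn <- paste0("r", 1:3)
cn <- c("a", "b")
m <- matrix(1:6, nrow = 3, dimnames = list(rn, cn))

# Design 1: strip dimnames at construction, re-attach them on access.
stripped <- list(assay = unname(m), rownames = rn, colnames = cn)
get_stripped <- function(x, withDimnames = TRUE) {
    a <- x$assay
    if (withDimnames)
        # R duplicates the matrix here, which is expensive for large assays.
        dimnames(a) <- list(x$rownames, x$colnames)
    a
}

# Design 2: keep (duplicate) the dimnames on the assay itself, so access
# returns the stored matrix as is, with nothing to re-attach.
duplicated_store <- list(assay = m, rownames = rn, colnames = cn)
get_duplicated <- function(x) x$assay

with_names    <- get_stripped(stripped, withDimnames = TRUE)
without_names <- get_stripped(stripped, withDimnames = FALSE)
as_stored     <- get_duplicated(duplicated_store)
```

In the second design the dimnames are stored twice (on the object and on the
assay), which is why a consistency check between the two becomes necessary.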

>
> Finally, it seems that the NEWS file in GenomicRanges is no longer
> maintained.  Is this intentional ? :(

the *Ranges tradition seems to be to update the NEWS files prior to release, 
rather than during development. So for instance

------------------------------------------------------------------------
r87773 | hpages at fhcrc.org | 2014-03-24 00:49:50 -0700 (Mon, 24 Mar 2014) | 1 line

start to update NEWS file with changes in the upcoming 1.16.0 version

>
> Best,
> Kasper
>
>
> r77404 | mtmorgan at fhcrc.org | 2013-06-11 15:52:25 -0400 (Tue, 11 Jun 2013) | 5 lines
>
> relax SummarizedExperiment assays class validity
>
> - dim() of length >= 2
> - does not guarantee functionality; may be altered in the future
>
> ------------------------------------------------------------------------
> r76679 | mtmorgan at fhcrc.org | 2013-05-16 17:12:43 -0400 (Thu, 16 May 2013) | 4 lines
>
> more efficient ref class constructor
>
> - new empty instance, the update slots
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


