[Bioc-devel] requirement for named assays in SummarizedExperiment

Martin Morgan mtmorgan at fredhutch.org
Thu Mar 12 17:24:09 CET 2015


On 03/12/2015 08:12 AM, Tim Triche, Jr. wrote:
> What he said
>
> This doesn't make any sense from an API perspective.  When would a user ever expect to see unnamed assay matrices?
>

When there's a single assay?
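
For reference, a minimal base-R sketch of the matching rule Valerie describes in the quoted message below: match by name when every assay is named, by position when all names are NULL, and stop on a mixture. It works on plain lists of matrices. `bind_assays` is a hypothetical helper written for illustration, not the SummarizedExperiment implementation.

```r
# Hypothetical helper: column-bind two lists of assay matrices.
# Illustration only; this is not the code used by cbind() on
# SummarizedExperiment objects.
bind_assays <- function(a, b) {
  stopifnot(length(a) == length(b))
  na <- names(a)
  nb <- names(b)
  if (is.null(na) && is.null(nb)) {
    # All names NULL: match assays by position.
    return(Map(cbind, a, b))
  }
  if (is.null(na) || is.null(nb) || !setequal(na, nb)) {
    # Mixture of names and NULL, or names that differ: refuse to guess.
    stop("elements in 'assays' must have the same names")
  }
  # All named: match by name, so the order in 'b' does not matter.
  setNames(lapply(na, function(n) cbind(a[[n]], b[[n]])), na)
}

m <- matrix(1:4, 2)
x <- list(counts = m, tpm = m * 2)
y <- list(tpm = m * 2, counts = m)   # same names, different order

named   <- bind_assays(x, y)                    # matched by name
unnamed <- bind_assays(unname(x), unname(x))    # matched by position
mixed   <- tryCatch(bind_assays(x, unname(y)),  # mixture -> error message
                    error = conditionMessage)
```

With a single unnamed assay on each side, the positional branch is unambiguous; with several assays, the named branch is the only one that tolerates reordering.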

> --t
>
>> On Mar 12, 2015, at 7:46 AM, Kasper Daniel Hansen <kasperdanielhansen at gmail.com> wrote:
>>
>> allowing positional matching strikes me as being far too fragile.
>> Depending on the actual implementation, it may not even be clear there is
>> an order of the assays.
>>
>> On Wed, Mar 11, 2015 at 2:45 PM, Valerie Obenchain <vobencha at fredhutch.org>
>> wrote:
>>
>>> Hi,
>>>
>>> After talking with others, the vote was against enforcing names on assays()
>>> and in favor of positional matching when all names are NULL. A mixture of
>>> named and unnamed assays (names on one object, NULL on another) throws an
>>> error.
>>>
>>> example(SummarizedExperiment)
>>>
>>> ## all named
>>> > se2 = se1
>>> > assays(cbind(se1, se2))
>>> List of length 1
>>> names(1): counts
>>>
>>> ## mixture of names and NULL -> error
>>> > names(assays(se1)) = NULL
>>> > assays(cbind(se1, se2))
>>> Error in assays(cbind(se1, se2)) :
>>>   error in evaluating the argument 'x' in selecting a method for function
>>> 'assays': Error in .bind.arrays(args, cbind, "assays") :
>>>   elements in ‘assays’ must have the same names
>>>
>>> ## all NULL -> positional matching
>>> > names(assays(se2)) = NULL
>>> > assays(cbind(se1, se2))
>>> List of length 1
>>>
>>> If we find common use cases where positional matching is needed with a
>>> mixture of names and NULL we can always relax this constraint.
>>>
>>> Changes are in 1.19.46.
>>>
>>> Valerie
>>>
>>>> On 03/06/2015 08:20 AM, Valerie Obenchain wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> Thanks for catching this.
>>>>
>>>> I favor enforcing names in 'assays'. Combining by position alone is too
>>>> dangerous. I'm thinking of the VCF class where the genotype information is
>>>> stored in 'assays' and the fields are rarely in the same order.
>>>>
>>>> Looks like we also need a more informative error message when names
>>>> don't match.
>>>>
>>>> > assays(se1)
>>>> List of length 1
>>>> names(1): counts1
>>>>
>>>> > assays(se2)
>>>> List of length 1
>>>> names(1): counts2
>>>>
>>>> > cbind(se1, se2)
>>>> Error in sQuote(accessorName) :
>>>>    argument "accessorName" is missing, with no default
>>>>
>>>>
>>>> Valerie
>>>>
>>>>
>>>>> On 03/05/2015 11:09 PM, Aaron Lun wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I stumbled upon some unexpected behaviour with cbind'ing
>>>>> SummarizedExperiment objects with unnamed assays:
>>>>>
>>>>> require(GenomicRanges)
>>>>> > nrows <- 5; ncols <- 4
>>>>> > counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>>>>> > rowData <- GRanges("chr1", IRanges(1:nrows, 1:nrows))
>>>>> > colData <- DataFrame(Treatment=1:ncols, row.names=LETTERS[1:ncols])
>>>>> > sset <- SummarizedExperiment(counts, rowData=rowData, colData=colData)
>>>>> > sset
>>>>> class: SummarizedExperiment
>>>>> dim: 5 4
>>>>> exptData(0):
>>>>> assays(1): ''
>>>>> rownames: NULL
>>>>> rowData metadata column names(0):
>>>>> colnames(4): A B C D
>>>>> colData names(1): Treatment
>>>>>
>>>>> > cbind(sset, sset)
>>>>> dim: 5 8
>>>>> exptData(0):
>>>>> assays(0):
>>>>> rownames: NULL
>>>>> rowData metadata column names(0):
>>>>> colnames(8): A B ... C1 D1
>>>>> colData names(1): Treatment
>>>>>
>>>>> Upon cbind'ing, the assays in the SE object are lost. I think this is
>>>>> due to the fact that the cbind code matches up assays by their names.
>>>>> Thus, if there are no names, the code assumes that there are no assays.
>>>>>
>>>>> I guess this could be prevented by enforcing naming of assays in the
>>>>> SummarizedExperiment constructor. Or, the binding code could be modified
>>>>> to work positionally when there are no assay names, e.g., by cbind'ing
>>>>> the first assays across all SE objects, then the second assays, etc.
>>>>>
>>>>> Any thoughts?
>>>>>
>>>>> Regards,
>>>>>
>>>>> Aaron
>>>>>
>>>>> sessionInfo()
>>>>> R Under development (unstable) (2014-12-14 r67167)
>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>
>>>>> locale:
>>>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>
>>>>> attached base packages:
>>>>> [1] stats4    parallel  stats     graphics  grDevices utils     datasets
>>>>> [8] methods   base
>>>>>
>>>>> other attached packages:
>>>>> [1] GenomicRanges_1.19.42 GenomeInfoDb_1.3.13   IRanges_2.1.41
>>>>> [4] S4Vectors_0.5.21      BiocGenerics_0.13.6
>>>>>
>>>>> loaded via a namespace (and not attached):
>>>>> [1] XVector_0.7.4
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>> --
>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>> 1100 Fairview Ave. N, Seattle, WA 98109
>>>
>>> Email: vobencha at fredhutch.org
>>> Phone: (206) 667-3158
>>>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


