[Bioc-devel] requirement for named assays in SummarizedExperiment

Ryan rct at thompsonclan.org
Thu Mar 12 18:01:15 CET 2015


Yes, a single-assay SummarizedExperiment would be the most common case 
for unnamed assays. But I think at the very least there should be a 
warning on unnamed assays.
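
To make the suggestion concrete, here is a rough base-R sketch of the matching rule discussed in this thread (all named: match by name; all unnamed: match by position; mixture: error), with a warning added for the unnamed case. It is only an illustration under stated assumptions: `bind_assays` is a hypothetical helper that works on plain lists of matrices, it is not the GenomicRanges implementation, and the message texts are made up.

```r
# Hypothetical helper (not part of GenomicRanges): column-bind several
# lists of assay matrices following the rule discussed in this thread.
bind_assays <- function(...) {
  assay_lists <- list(...)

  # TRUE for each input whose assays carry names
  has_names <- vapply(assay_lists, function(a) !is.null(names(a)), logical(1))

  # A mixture of named and unnamed inputs is treated as an error
  if (any(has_names) && !all(has_names)) {
    stop("elements in 'assays' must be all named or all unnamed")
  }

  # Every input must hold the same number of assays
  n_assays <- unique(lengths(assay_lists))
  if (length(n_assays) != 1L) {
    stop("all objects must have the same number of assays")
  }

  if (all(has_names)) {
    # Match by name; the order of the first input is used for the result
    keys <- names(assay_lists[[1]])
    same <- vapply(assay_lists, function(a) setequal(names(a), keys), logical(1))
    if (!all(same)) stop("elements in 'assays' must have the same names")
  } else {
    # No names anywhere: fall back to position, but say so
    warning("assays are unnamed; matching by position")
    keys <- seq_len(n_assays)
  }

  # cbind the k-th (or k-named) assay across all inputs
  out <- lapply(keys, function(k) {
    do.call(cbind, lapply(assay_lists, function(a) a[[k]]))
  })
  if (all(has_names)) names(out) <- keys
  out
}

# Usage: two unnamed single-assay inputs are combined, with a warning
m <- matrix(1:4, nrow = 2)
res <- bind_assays(list(m), list(m))
```

With something along these lines the single-assay case keeps working, but the user is told that the result depends on assay order.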

On 3/12/15 9:24 AM, Martin Morgan wrote:
> On 03/12/2015 08:12 AM, Tim Triche, Jr. wrote:
>> What he said
>>
>> This doesn't make any sense from an API perspective.  When would a 
>> user ever expect to see unnamed assay matrices?
>>
>
> When there's a single assay?
>
>> --t
>>
>>> On Mar 12, 2015, at 7:46 AM, Kasper Daniel Hansen 
>>> <kasperdanielhansen at gmail.com> wrote:
>>>
>>> Allowing positional matching strikes me as being far too fragile.
>>> Depending on the actual implementation, it may not even be clear
>>> that there is an order of the assays.
>>>
>>> On Wed, Mar 11, 2015 at 2:45 PM, Valerie Obenchain
>>> <vobencha at fredhutch.org> wrote:
>>>
>>>> Hi,
>>>>
>>>> After talking with others, the vote was against enforcing names on
>>>> assays() and for positional matching if all names are NULL. A mixture
>>>> of names and NULL throws an error.
>>>>
>>>> example(SummarizedExperiment)
>>>>
>>>> ## all named
>>>> > se2 = se1
>>>> > assays(cbind(se1, se2))
>>>> List of length 1
>>>> names(1): counts
>>>>
>>>> ## mixture of names and NULL -> error
>>>> > names(assays(se1)) = NULL
>>>> > assays(cbind(se1, se2))
>>>> Error in assays(cbind(se1, se2)) :
>>>>   error in evaluating the argument 'x' in selecting a method for
>>>>   function 'assays': Error in .bind.arrays(args, cbind, "assays") :
>>>>   elements in ‘assays’ must have the same names
>>>>
>>>> ## all NULL -> positional matching
>>>> > names(assays(se2)) = NULL
>>>> > assays(cbind(se1, se2))
>>>> List of length 1
>>>>
>>>> If we find common use cases where positional matching is needed with a
>>>> mixture of names and NULL we can always relax this constraint.
>>>>
>>>> Changes are in 1.19.46.
>>>>
>>>> Valerie
>>>>
>>>>
>>>>
>>>>
>>>>> On 03/06/2015 08:20 AM, Valerie Obenchain wrote:
>>>>>
>>>>> Hi Aaron,
>>>>>
>>>>> Thanks for catching this.
>>>>>
>>>>> I favor enforcing names in 'assays'. Combining by position alone is
>>>>> too dangerous. I'm thinking of the VCF class, where the genome
>>>>> information is stored in 'assays' and the fields are rarely in the
>>>>> same order.
>>>>>
>>>>> Looks like we also need a more informative error message when names
>>>>> don't match.
>>>>>
>>>>> > assays(se1)
>>>>> List of length 1
>>>>> names(1): counts1
>>>>>
>>>>> > assays(se2)
>>>>> List of length 1
>>>>> names(1): counts2
>>>>>
>>>>> > cbind(se1, se2)
>>>>> Error in sQuote(accessorName) :
>>>>>    argument "accessorName" is missing, with no default
>>>>>
>>>>>
>>>>> Valerie
>>>>>
>>>>>
>>>>>> On 03/05/2015 11:09 PM, Aaron Lun wrote:
>>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I stumbled upon some unexpected behaviour with cbind'ing
>>>>>> SummarizedExperiment objects with unnamed assays:
>>>>>>
>>>>>> require(GenomicRanges)
>>>>>> > nrows <- 5; ncols <- 4
>>>>>> > counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>>>>>> > rowData <- GRanges("chr1", IRanges(1:nrows, 1:nrows))
>>>>>> > colData <- DataFrame(Treatment=1:ncols, row.names=LETTERS[1:ncols])
>>>>>> > sset <- SummarizedExperiment(counts, rowData=rowData, colData=colData)
>>>>>> > sset
>>>>>> class: SummarizedExperiment
>>>>>> dim: 5 4
>>>>>> exptData(0):
>>>>>> assays(1): ''
>>>>>> rownames: NULL
>>>>>> rowData metadata column names(0):
>>>>>> colnames(4): A B C D
>>>>>> colData names(1): Treatment
>>>>>>
>>>>>> >
>>>>>> > cbind(sset, sset)
>>>>>> dim: 5 8
>>>>>> exptData(0):
>>>>>> assays(0):
>>>>>> rownames: NULL
>>>>>> rowData metadata column names(0):
>>>>>> colnames(8): A B ... C1 D1
>>>>>> colData names(1): Treatment
>>>>>>
>>>>>> Upon cbind'ing, the assays in the SE object are lost. I think this is
>>>>>> due to the fact that the cbind code matches up assays by their names.
>>>>>> Thus, if there are no names, the code assumes that there are no assays.
>>>>>>
>>>>>> I guess this could be prevented by enforcing naming of assays in the
>>>>>> SummarizedExperiment constructor. Or, the binding code could be modified
>>>>>> to work positionally when there are no assay names, e.g., by cbind'ing
>>>>>> the first assays across all SE objects, then the second assays, etc.
>>>>>>
>>>>>> Any thoughts?
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>>> sessionInfo()
>>>>>> R Under development (unstable) (2014-12-14 r67167)
>>>>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>>>>
>>>>>> locale:
>>>>>>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>>>>>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>>>>>   [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>>>>>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>>>>>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>>>>
>>>>>> attached base packages:
>>>>>> [1] stats4    parallel  stats     graphics  grDevices utils     datasets
>>>>>> [8] methods   base
>>>>>>
>>>>>> other attached packages:
>>>>>> [1] GenomicRanges_1.19.42 GenomeInfoDb_1.3.13 IRanges_2.1.41
>>>>>> [4] S4Vectors_0.5.21      BiocGenerics_0.13.6
>>>>>>
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] XVector_0.7.4
>>>>>>
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-devel at r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>>
>>>> -- 
>>>> Computational Biology / Fred Hutchinson Cancer Research Center
>>>> 1100 Fairview Ave. N, Seattle, WA 98109
>>>>
>>>> Email: vobencha at fredhutch.org
>>>> Phone: (206) 667-3158
>>>>
>>>>
>>>
>>
>>
>
>


