[Bioc-devel] 'droplevels' argument in `[` method for SummarizedExperiment?

Steve Lianoglou lianoglou.steve at gene.com
Thu Mar 13 00:26:16 CET 2014


On Wed, Mar 12, 2014 at 3:52 PM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> On Wed, Mar 12, 2014 at 3:45 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
>
>> On 03/12/2014 03:02 PM, Wolfgang Huber wrote:
>>
>>> Hi Martin, Mike
>>>
>>> a DESeq2 user brought up the observation that when he subsets a
>>> 'DESeqDataSet' object (the class inherits from 'SummarizedExperiment') by
>>> samples, he often ends up with unused factor levels in the colData. (Esp.
>>> since the subsetting is often to select certain subgroups). Would either of
>>> the following two make sense:
>>>
>>> - a 'droplevels' method for 'SummarizedExperiment' that efficiently and
>>> conveniently removes unused levels, i.e.
>>>       x = x[, x$tissue %in% c("guts", "brains")]
>>>       x = droplevels(x)
>>>
>>
>> vs. x$tissue = droplevels(x$tissue)

Or do:

colData(x) <- droplevels(colData(x))
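
To make the effect concrete, here is a toy illustration in base R. A plain data.frame stands in for the colData, and the sample values (including the "liver" level) are invented for the example, not taken from anyone's data:

```r
# Toy stand-in for colData: one factor column with three levels.
coldata <- data.frame(
    tissue = factor(c("guts", "brains", "liver", "guts")),
    row.names = paste0("sample", 1:4)
)

# Subsetting rows keeps the full set of factor levels.
sub <- coldata[coldata$tissue %in% c("guts", "brains"), , drop = FALSE]
before <- levels(sub$tissue)   # "brains" "guts" "liver"

# droplevels() on a data.frame cleans every factor column at once.
sub <- droplevels(sub)
after <- levels(sub$tissue)    # "brains" "guts"
```

The one-liner above does the same thing for every factor column of the colData in a single call, assuming droplevels() is available for the DataFrame that colData() returns.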

>> there are a surprising number of places where levels could be dropped --
>> each column of colData, each column of (possibly two levels of) 'mcols' on
>> the row data, and the seqlevels of the row data.

Perhaps true; however, in Wolfgang's case (the DESeqDataSet) I think
most people would want that to work over the colData of the object.

In which case, perhaps DESeq2 should just define a
droplevels,DESeqDataSet method that works over the colData of the dds,
to provide that convenience.
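
Something along these lines is what I have in mind. This is only a sketch: the method is hypothetical (it is not something DESeq2 provides), and it again assumes that droplevels() works on the DataFrame returned by colData(). It deliberately leaves the row data and seqlevels alone:

```r
library(DESeq2)  # provides the DESeqDataSet class and colData() accessors

# Hypothetical convenience method: drop unused factor levels from the
# sample annotations (colData) only, leaving everything else untouched.
setMethod("droplevels", "DESeqDataSet", function(x, ...) {
    colData(x) <- droplevels(colData(x), ...)
    x
})
```

With that in place, Wolfgang's two-line example (subset by x$tissue, then x = droplevels(x)) would work as written for a DESeqDataSet.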

-steve

-- 
Steve Lianoglou
Computational Biologist
Genentech
