[Bioc-devel] rownames in SummarizedExperiments

Simon Anders anders at embl.de
Sun Apr 6 23:48:47 CEST 2014


Hi Michael

On 06/04/14 23:32, Michael Lawrence wrote:
> On an arbitrary vector, the names do not need to be unique, but they DO
> need to be unique on a DataFrame (according to the data.frame
> conventions). Conditioning on whether there are duplicate names would be
> too complicated, so it is left to the user to declare whether the names
> are expected on the result. Since in general the vector names are not
> valid rownames, the default is FALSE. I guess if we really wanted to be
> consistent with R, we would mangle the names to make them unique, but
> that check is expensive.

Thanks for the response, but I'm not sure I understand it. I thought
"use.names=TRUE" instructs "mcols" to use the rownames of the
SummarizedExperiment object as rownames for the returned DataFrame. Now,
as the rownames of the SummarizedExperiment have to be unique anyway (at
least, I suppose they have to -- they are row names, after all, and not
just the names of an arbitrary vector), how can duplicate names appear
at all?

The use case: I have a SummarizedExperiment object with gene IDs in the
rownames. Let's say I want to get the value in the meta-data column
"yellowness" for "gene_D".

With an ExpressionSet, I could write:
   fData(es)["gene_D","yellowness"]

With a SummarizedExperiment, it has to be:
   mcols(se,use.names=TRUE)["gene_D","yellowness"]

Of course, it's no big deal, but I find it quite clumsy, and I wonder
why it has to be this way.
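
To make the naming rules under discussion concrete, here is a minimal
base R sketch. It uses a plain data.frame as a stand-in for the
DataFrame returned by mcols, which is an assumption for illustration
only; the gene IDs and "yellowness" values are made up.

```r
# Names on an ordinary vector may repeat.
x <- c(a = 1, a = 2, b = 3)
has_dup_names <- anyDuplicated(names(x)) > 0

# Row names on a data.frame must be unique; duplicates raise an error.
dup_result <- tryCatch(
  {
    data.frame(v = 1:3, row.names = c("a", "a", "b"))
    "accepted"
  },
  error = function(e) "rejected"
)

# "Mangling" the names to make them unique, as base R's make.unique() does.
mangled <- make.unique(c("a", "a", "b"))

# With unique row names, lookup by name works as in the fData() example.
# (Toy data; a data.frame stands in for the DataFrame here.)
df <- data.frame(
  yellowness = c(0.1, 0.5, 0.2, 0.9),
  row.names  = paste0("gene_", LETTERS[1:4])
)
value <- df["gene_D", "yellowness"]
```

If the rownames of the object are already guaranteed unique, the
duplicate case in the middle of the sketch never arises, which is the
point of the question above.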

  Simon


