[Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error

Robert Castelo
Thu May 18 08:59:15 CEST 2023


FWIW, it seems to me that the constructor expects consistency between 
the assay data and the column data. Given the correct row names, 
there is no error:

SummarizedExperiment(assays = list(counts = countsMini), colData = 
DataFrame(row.names=colnames(countsMini)))
class: SummarizedExperiment
dim: 10 10
metadata(0):
assays(1): counts
rownames(10): Gene 1 Gene 2 ... Gene 9 Gene 10
rowData names(0):
colnames(10): Cell 1 Cell 2 ... Cell 9 Cell 10
colData names(0):
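
The mismatch can be checked without calling the constructor at all. The 
following is a minimal sketch, assuming the S4Vectors package is 
installed; the names emptyColData and matchedColData are only 
illustrative and do not appear in the original report:

```r
# Assumes the S4Vectors package (Bioconductor) is installed.
library(S4Vectors)

# Same toy assay as in the original report.
countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)

# An explicit empty DataFrame has zero rows, so it cannot take
# ncol(countsMini) row names.
emptyColData <- DataFrame()
nrow(emptyColData) == ncol(countsMini)    # FALSE

# A zero-column DataFrame with one row per assay column does line up.
matchedColData <- DataFrame(row.names = colnames(countsMini))
nrow(matchedColData) == ncol(countsMini)  # TRUE
identical(rownames(matchedColData), colnames(countsMini))
```

This only illustrates the row-count mismatch behind the "invalid rownames 
length" error; it is not a description of the constructor's internal checks.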

Not sure whether this is relevant, but I observed that while an empty 
base R 'data.frame()' constructor gives zero-length character vectors 
for both row and column names, the empty 'DataFrame()' constructor also 
gives a zero-length character vector for column names, but NULL for row 
names. Shouldn't this be consistent with base R 'data.frame()'?

dimnames(data.frame())
[[1]]
character(0)

[[2]]
character(0)

dimnames(DataFrame())
[[1]]
NULL

[[2]]
character(0)
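
Code that has to handle both classes can sidestep the difference by 
testing the length of the row names rather than comparing against NULL or 
character(0), since both have length zero. A small sketch, assuming 
S4Vectors is installed; has_no_row_names() is a hypothetical helper, not 
part of any package:

```r
# Assumes the S4Vectors package (Bioconductor) is installed.
library(S4Vectors)

# Hypothetical helper: TRUE when an object carries no row names,
# whether they are stored as NULL or as character(0).
has_no_row_names <- function(x) length(rownames(x)) == 0L

has_no_row_names(data.frame())  # TRUE: row names are character(0)
has_no_row_names(DataFrame())   # TRUE: row names are NULL
```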

robert.

On 5/17/23 20:45, Hervé Pagès wrote:
> Not sure why the colData default is DataFrame(). Seems like this has 
> been the default since the birth of the SummarizedExperiment class 
> back in 2010 (FWIW the class was born in the GenomicRanges package). 
> Anyways, it should probably be NULL, like for rowData. Can you please 
> open an issue on GitHub for this? Thanks
>
> H.
>
> On 5/12/23 07:00, Dario Strbenac via Bioc-devel wrote:
>> Good day,
>>
>> The default value of colData is DataFrame(). Not specifying an 
>> informative colData is fine.
>>
>> countsMini <- matrix(rpois(100, 100), ncol = 10)
>> colnames(countsMini) <- paste("Cell", 1:10)
>> rownames(countsMini) <- paste("Gene", 1:10)
>> # Creates the object successfully.
>> SummarizedExperiment(assays = list(counts = countsMini))
>>
>> But, explicitly specifying an empty DataFrame triggers an error. I 
>> don't understand why it is not equivalent to the constructor's default.
>>
>> SummarizedExperiment(assays = list(counts = countsMini), colData = 
>> DataFrame())
>> Error in `rownames<-`(`*tmp*`, value = 
>> .get_colnames_from_first_assay(assays)) :
>>    invalid rownames length
>>
>> What is the subtle difference? It also seems like there could be a 
>> clearer error message emitted if this is caught in the right place.
>>
>> --------------------------------------
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>
-- 
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514


