[Bioc-devel] Empty DataFrame Causes SummarizedExperiment Constructor Error
Robert Castelo
robert.castelo at upf.edu
Thu May 18 08:59:15 CEST 2023
FWIW, it seems to me that the constructor expects the assay data and
the column data to be consistent with each other. Given the correct row
names, there's no error:
SummarizedExperiment(assays = list(counts = countsMini), colData =
DataFrame(row.names=colnames(countsMini)))
class: SummarizedExperiment
dim: 10 10
metadata(0):
assays(1): counts
rownames(10): Gene 1 Gene 2 ... Gene 9 Gene 10
rowData names(0):
colnames(10): Cell 1 Cell 2 ... Cell 9 Cell 10
colData names(0):
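To make the mismatch easier to spot, here is a rough sketch of a check one
could run before calling the constructor. 'check_coldata' is a hypothetical
helper name, not something SummarizedExperiment provides, and base R
'data.frame' stands in for 'DataFrame' so that the snippet runs without
Bioconductor installed ('nrow()' works on both classes):

```r
# Hypothetical helper (not part of SummarizedExperiment): verify that the
# column data has one row per assay column before calling the constructor.
check_coldata <- function(assay, col_data) {
  n_cols <- ncol(assay)
  n_rows <- nrow(col_data)
  if (n_rows != n_cols) {
    stop(sprintf(
      "colData has %d row(s) but the assay has %d column(s)",
      n_rows, n_cols
    ), call. = FALSE)
  }
  invisible(TRUE)
}

countsMini <- matrix(rpois(100, 100), ncol = 10)
colnames(countsMini) <- paste("Cell", 1:10)
rownames(countsMini) <- paste("Gene", 1:10)

# Column data with one row per assay column passes the check.
ok <- check_coldata(countsMini, data.frame(row.names = colnames(countsMini)))

# An empty data frame has zero rows, so the check reports the mismatch.
msg <- tryCatch(check_coldata(countsMini, data.frame()),
                error = function(e) conditionMessage(e))
```

This only compares dimensions; it says nothing about how the constructor
treats its default argument internally.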
Not sure whether this is relevant, but I observed that an empty base R
'data.frame()' gives zero-length character vectors for both row and
column names, whereas an empty 'DataFrame()' gives a zero-length
character vector for column names but NULL for row names. Shouldn't
this be consistent with base R 'data.frame()'?
dimnames(data.frame())
[[1]]
character(0)
[[2]]
character(0)
dimnames(DataFrame())
[[1]]
NULL
[[2]]
character(0)
robert.
On 5/17/23 20:45, Hervé Pagès wrote:
> Not sure why the colData default is DataFrame(). Seems like this has
> been the default since the birth of the SummarizedExperiment class
> back in 2010 (FWIW the class was born in the GenomicRanges package).
> Anyways, it should probably be NULL, like for rowData. Can you please
> open an issue on GitHub for this? Thanks
>
> H.
>
> On 5/12/23 07:00, Dario Strbenac via Bioc-devel wrote:
>> Good day,
>>
>> The default value of colData is DataFrame(). Not specifying an
>> informative colData is fine.
>>
>> countsMini <- matrix(rpois(100, 100), ncol = 10)
>> colnames(countsMini) <- paste("Cell", 1:10)
>> rownames(countsMini) <- paste("Gene", 1:10)
>> # Creates the object successfully.
>> SummarizedExperiment(assays = list(counts = countsMini))
>>
>> But, explicitly specifying an empty DataFrame triggers an error. I
>> don't understand why it is not equivalent to the constructor's default.
>>
>> SummarizedExperiment(assays = list(counts = countsMini), colData =
>> DataFrame())
>> Error in `rownames<-`(`*tmp*`, value =
>> .get_colnames_from_first_assay(assays)) :
>> invalid rownames length
>>
>> What is the subtle difference? It also seems like there could be a
>> clearer error message emitted if this is caught in the right place.
>>
>> --------------------------------------
>> Dario Strbenac
>> University of Sydney
>> Camperdown NSW 2050
>> Australia
>> _______________________________________________
>> Bioc-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
--
Robert Castelo, PhD
Associate Professor
Dept. of Medicine and Life Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514