[Bioc-devel] transitioning scater/scran to SingleCellExperiment

Angerer, Philipp philipp.angerer at helmholtz-muenchen.de
Tue Aug 8 09:59:56 CEST 2017

Hi Aaron, 

> I guess this would be a question for the SummarizedExperiment developers, though personally, I never liked ExpressionSet's inclination to slap names on everything.

Too bad we’re bound to SummarizedExperiment’s “rows” and “cols”. Since they always refer to features and samples, respectively: Why not name them that? 

There are already too many APIs in too many programming languages that confusingly follow one convention or the other. If we know which is which, why not name them after that knowledge?

> It probably wouldn't be a good idea to store distances as expression matrices. However, if there is a need for it, we can add a new slot for distance matrices. I think SC3 has a similar requirement, so perhaps this would be more generally useful than I first thought. You can post an issue on the github repository to remind Davide or me to do it.

Distance matrices (cell×cell) don't have to come from cell×gene matrices. For example, you can use dynamic time warping to create them from cell×gene×time arrays.
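To make the dynamic time warping case concrete, here is a minimal sketch in base R. The toy array, its dimensions, and the helper names (expr, cell_mat, dtw_dist) are made up for illustration, and the local cost (Euclidean distance between gene vectors at two time points) is just one possible choice:

```r
# Hypothetical toy data: 4 cells x 3 genes x 5 time points.
set.seed(1)
cells <- paste0("cell", 1:4)
expr <- array(rnorm(4 * 3 * 5), dim = c(4, 3, 5),
              dimnames = list(cells, paste0("gene", 1:3), NULL))

# Gene x time matrix for one cell.
cell_mat <- function(i) matrix(expr[i, , ], nrow = dim(expr)[2])

# Textbook dynamic-programming DTW between two gene x time matrices.
# Local cost: Euclidean distance between the gene vectors at times i and j.
dtw_dist <- function(a, b) {
  n <- ncol(a)
  m <- ncol(b)
  acc <- matrix(Inf, n + 1, m + 1)
  acc[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- sqrt(sum((a[, i] - b[, j])^2))
      acc[i + 1, j + 1] <- cost + min(acc[i, j + 1], acc[i + 1, j], acc[i, j])
    }
  }
  acc[n + 1, m + 1]
}

# Fill the cell x cell distance matrix.
d <- matrix(0, length(cells), length(cells), dimnames = list(cells, cells))
for (i in seq_along(cells)) {
  for (j in seq_along(cells)) {
    d[i, j] <- dtw_dist(cell_mat(i), cell_mat(j))
  }
}
cell_dist <- as.dist(d)
```

The result is a cell×cell object with no cell×gene matrix it could be recomputed from, which is why a dedicated place to store it would help.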

> Finally, I'm not sure what advantages those ergonomics provide. Indeed, if every package defines its own plot() S4 method for SingleCellExperiment, they will clobber each other in the dispatch table, resulting in some interesting results dependent on package loading order. If you have destiny-specific data and methods, best to keep them separate rather than stuffing them into the SCE object.
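The clobbering concern can be imitated in a single R session. This is only a sketch: ToySCE and describe are hypothetical stand-ins (not the real SingleCellExperiment class or plot generic), and two setMethod calls in one session merely simulate two packages registering a method for the same signature:

```r
library(methods)

# Hypothetical stand-in class and generic, so that no Bioconductor
# package is needed to run this.
setClass("ToySCE", representation(counts = "matrix"))
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# "Package A" registers its method for the class ...
setMethod("describe", "ToySCE", function(x, ...) "package A's describe")

# ... and "package B" later registers one for the same signature,
# replacing the first entry in the generic's method table.
setMethod("describe", "ToySCE", function(x, ...) "package B's describe")

obj <- new("ToySCE", counts = matrix(0, 2, 2))
result <- describe(obj)  # whichever method was registered last is used
```

A separately named function such as plot_dm sidesteps this, since nothing else would register under that name.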

I wrote that I could e.g. create a plot_dm method, which plots a diffusion map stored in a SCE. 

Also, by ergonomics I didn't mean the plot method. I meant fortify, names, $, and [[. Those would be very useful, as you could just do things like the following, and have autocompletion:
sce$Predicate1 <- sce$SampleMeta1 > 40 # `$` accesses counts (by gene) and rowData. `$<-` sets rowData 
qplot(Gene1, Gene2, colour = Predicate1, data = sce) # fortify creates a data.frame containing cbind(t(counts), rowData) 

Just as you can do now with DiffusionMap objects. 

Also, I'm not sure if I got rowData and the “t” right in the above code ;) I meant cbind(counts as cell×gene, sampleMeta as cell×n_meta)
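A rough sketch of what I mean, using a toy S4 class in base R. Everything here is hypothetical (the ToyExperiment class, its counts and meta slots, and toy_fortify standing in for a real fortify method); it is not how SingleCellExperiment or destiny implement anything:

```r
library(methods)

# Hypothetical toy container: counts is gene x cell, meta has one row per cell.
setClass("ToyExperiment", representation(counts = "matrix", meta = "data.frame"))

# names(): gene names followed by the per-cell metadata columns.
setMethod("names", "ToyExperiment", function(x) {
  c(rownames(x@counts), colnames(x@meta))
})

# `$` looks up a gene first, then falls back to a metadata column.
setMethod("$", "ToyExperiment", function(x, name) {
  if (name %in% rownames(x@counts)) x@counts[name, ] else x@meta[[name]]
})

# `$<-` always writes to the per-cell metadata.
setMethod("$<-", "ToyExperiment", function(x, name, value) {
  x@meta[[name]] <- value
  x
})

# Stand-in for fortify: cells in rows, genes and metadata in columns,
# i.e. cbind(counts as cell x gene, metadata as cell x n_meta).
toy_fortify <- function(x) cbind(as.data.frame(t(x@counts)), x@meta)

cell_ids <- c("c1", "c2", "c3")
sce <- new("ToyExperiment",
  counts = matrix(c(1, 0, 5, 2, 3, 7), nrow = 2,
                  dimnames = list(c("Gene1", "Gene2"), cell_ids)),
  meta = data.frame(SampleMeta1 = c(30, 45, 60), row.names = cell_ids))

sce$Predicate1 <- sce$SampleMeta1 > 40
df <- toy_fortify(sce)
```

The data.frame in df has one row per cell with columns Gene1, Gene2, SampleMeta1, and Predicate1, which is the shape the qplot call above expects as its data argument.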


