[Bioc-devel] SingleCellExperiment refactoring

Aaron Lun
Mon Jul 22 06:55:11 CEST 2019

Dear list,

We are planning to modify the SingleCellExperiment class to better 
accommodate alternative feature sets from CITE-seq and Perturb-seq 
experiments. A new "altExps" concept has been added to store 
experimental data for alternative feature sets as nested 
SummarizedExperiment instances within a SingleCellExperiment. This aims 
to provide a flexible and lightweight approach to storing multiple 
experiments without requiring major changes to user workflows when only 
the main feature set (i.e., endogenous genes) is of interest.

The "altExps" concept can also be extended to storage of spike-in 
transcripts. In fact, it is more convenient than the current "isSpike" 
approach, as the latter requires subsetting to remove the spike-ins 
prior to performing gene-only operations on the expression matrix (e.g., 
clustering). For this reason, we are planning to deprecate the "isSpike" 
functionality for marking rows as spike-ins. This will be replaced with 
the more general "SingleCellExperiment::splitSCEByAlt" function, which 
splits an SCE into a main SCE and nested alternative SCEs for minority 
features like spike-in transcripts, antibody or CRISPR tags, etc.
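
To make the splitting idea concrete, here is a minimal toy sketch in 
Python. It is not the R implementation and does not use the 
SingleCellExperiment API: the "Experiment" class, the "split_by_alt" 
function and the feature names are all hypothetical, chosen only to 
show rows being partitioned into a main set plus nested alternative 
sets that share the same cells.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Toy stand-in for a SummarizedExperiment (hypothetical, not the R class)."""
    rows: list      # feature names, one per matrix row
    counts: list    # list of rows; each row holds one count per cell
    alt_exps: dict = field(default_factory=dict)  # nested Experiments by name


def split_by_alt(exp, labels, main):
    """Partition rows by label; `main` rows stay on top, the rest are nested."""
    groups = {}
    for name, row, label in zip(exp.rows, exp.counts, labels):
        grp = groups.setdefault(label, Experiment([], []))
        grp.rows.append(name)
        grp.counts.append(row)
    out = groups.pop(main)   # the main feature set (e.g., endogenous genes)
    out.alt_exps = groups    # everything else becomes a nested experiment
    return out


# Four features measured in the same three cells (made-up numbers).
sce = Experiment(
    rows=["GeneA", "GeneB", "ERCC-1", "CD3-ADT"],
    counts=[[5, 0, 2], [1, 3, 4], [10, 12, 9], [7, 0, 1]],
)
labels = ["gene", "gene", "spike", "antibody"]
main = split_by_alt(sce, labels, main="gene")

print(main.rows)                      # ['GeneA', 'GeneB']
print(sorted(main.alt_exps))          # ['antibody', 'spike']
print(main.alt_exps["spike"].counts)  # [[10, 12, 9]]
```

After the split, gene-only operations can run on the main object 
directly, with no subsetting step to drop the spike-in rows first.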

These proposed changes are expected to have the following effects on 
packages downstream of SingleCellExperiment:

- No change is required for packages that do not use spike-in 
information or multiple size factor settings.
- Packages using spike-in transcripts via "isSpike" should switch to 
"altExps" to retrieve spike-in data, with experiment-specific size 
factors to perform spike-in-specific normalization.
- Packages using other features (e.g., antibody tags) should consider 
using "altExps" to retrieve/store this data.
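
As a rough illustration of "experiment-specific size factors", the 
sketch below computes one set of size factors per feature set, each 
from its own count matrix. It is a toy example in Python, not the R 
API. The choice of library-size factors (per-cell totals scaled to a 
mean of 1) is an assumption made for illustration, since the proposal 
does not prescribe a particular normalization method, and the function 
name and counts are hypothetical.

```python
def library_size_factors(counts):
    """Per-cell column totals, scaled so that the factors average to 1."""
    totals = [sum(col) for col in zip(*counts)]  # one total per cell
    mean_total = sum(totals) / len(totals)
    return [t / mean_total for t in totals]


# Same three cells, two feature sets (made-up numbers).
gene_counts = [[5, 0, 2], [1, 3, 4]]   # main experiment: endogenous genes
spike_counts = [[10, 20, 30]]          # alternative experiment: spike-ins

gene_sf = library_size_factors(gene_counts)
spike_sf = library_size_factors(spike_counts)

print(spike_sf)  # [0.5, 1.0, 1.5]
```

Each feature set is normalized with its own factors, so the spike-in 
scaling does not depend on the endogenous gene counts.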

More technical details can be found in the discussion at 
https://github.com/drisso/SingleCellExperiment/pull/32, which also 
contains a testable implementation of the proposed change. Comments and 
other feedback on the proposed plan should be directed there.
