[Bioc-devel] SingleCellExperiment refactoring
Aaron Lun
|n||n|te@monkey@@w|th@keybo@rd@ @end|ng |rom gm@||@com
Mon Jul 22 06:55:11 CEST 2019
Dear list,
We are planning to modify the SingleCellExperiment class to better
accommodate alternative feature sets from CITE-seq and Perturb-seq
experiments. A new "altExps" concept has been added to store
experimental data for alternative feature sets as nested
SummarizedExperiment instances within a SingleCellExperiment. This aims
to provide a flexible and lightweight approach to storing multiple
experiments without requiring major changes to user workflows when only
the main feature set (i.e., endogenous genes) is of interest.
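As a rough illustration of the nesting described above, here is a minimal sketch on simulated data. It assumes the accessors altExp() and altExpNames() as they appear in released versions of SingleCellExperiment; the implementation in the pull request may name them differently. The feature names, the "ADT" label and the counts are invented for the example.

```r
library(SingleCellExperiment)

set.seed(1)
n_cells <- 10
cell_ids <- paste0("Cell", seq_len(n_cells))

# Main feature set: endogenous genes (simulated counts, illustration only).
gene_mat <- matrix(rpois(50 * n_cells, lambda = 5), ncol = n_cells,
                   dimnames = list(paste0("Gene", 1:50), cell_ids))

# Alternative feature set: antibody-derived tags measured in the same cells.
adt_mat <- matrix(rpois(3 * n_cells, lambda = 20), ncol = n_cells,
                  dimnames = list(paste0("ADT", 1:3), cell_ids))

# The main experiment only contains the genes.
sce <- SingleCellExperiment(assays = list(counts = gene_mat))

# Nest the alternative features as a SummarizedExperiment
# ("ADT" is an arbitrary label chosen for this sketch).
altExp(sce, "ADT") <- SummarizedExperiment(assays = list(counts = adt_mat))

# Gene-level operations see only the main feature set ...
dim(sce)
# ... while the nested experiment is retrieved by name when needed.
altExpNames(sce)
dim(altExp(sce, "ADT"))
```

Code that only cares about the endogenous genes can keep using "sce" as before and never needs to know that the nested experiment exists.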
The "altExps" concept can also be extended to storage of spike-in
transcripts. In fact, it is more convenient than the current "isSpike"
approach, as the latter requires subsetting to remove the spike-ins
prior to performing gene-only operations on the expression matrix (e.g.,
clustering). For this reason, we are planning to deprecate the "isSpike"
functionality for marking rows as spike-ins. This will be replaced with
the more general "SingleCellExperiment::splitSCEByAlt" function, which
splits an SCE into a main SCE and nested alternative SCEs for minority
features like spike-in transcripts, antibody or CRISPR tags, etc.
These proposed changes are expected to have the following effects on
packages downstream of SingleCellExperiment:
- No change is required for packages that do not use spike-in
information or multiple size factor settings.
- Packages using spike-in transcripts via "isSpike" should switch to
"altExps" to retrieve spike-in data, with experiment-specific size
factors to perform spike-in-specific normalization.
- Packages using other features (e.g., antibody tags) should consider
using "altExps" to retrieve/store this data.
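For downstream packages, the switch away from "isSpike" might look roughly like the sketch below. It does the split by hand with altExp()<- rather than calling the proposed splitting function, whose final name and signature are not assumed here. The "ERCC-" row-name prefix, the "ERCC" label and the library-size-based size factors are assumptions made for the example, not part of the proposal.

```r
library(SingleCellExperiment)

set.seed(2)
n_cells <- 10
mat <- matrix(rpois(55 * n_cells, lambda = 5), ncol = n_cells,
              dimnames = list(c(paste0("Gene", 1:50), paste0("ERCC-", 1:5)),
                              paste0("Cell", seq_len(n_cells))))

# Starting point: genes and spike-ins stored together in one matrix.
combined <- SingleCellExperiment(assays = list(counts = mat))

# Identify spike-in rows (here by a row-name prefix invented for this toy data).
is_spike <- grepl("^ERCC-", rownames(combined))

# Keep only endogenous genes in the main experiment ...
sce <- combined[!is_spike, ]
# ... and nest the spike-ins as an alternative experiment.
altExp(sce, "ERCC") <- combined[is_spike, ]

# Experiment-specific size factors: computed separately for each feature set.
# Simple library-size factors are used purely as a placeholder.
gene_totals <- colSums(counts(sce))
sizeFactors(sce) <- gene_totals / mean(gene_totals)

spike <- altExp(sce, "ERCC")
spike_totals <- colSums(counts(spike))
sizeFactors(spike) <- spike_totals / mean(spike_totals)
altExp(sce, "ERCC") <- spike

# Spike-in data and their size factors are retrieved from the nested experiment.
head(sizeFactors(altExp(sce, "ERCC")))
```

With this layout, gene-only steps such as clustering operate on "sce" directly, with no prior subsetting to drop the spike-in rows.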
More technical details can be found in the discussion at
https://github.com/drisso/SingleCellExperiment/pull/32, which also
contains a testable implementation of the proposed change. Comments and
other feedback on the proposed plan should be directed there.
-A