[Bioc-devel] Plans for multi-feature SingleCellExperiment?

Aaron Lun
Tue Jan 22 23:54:10 CET 2019


For 10X experiments, the Bioc-devel version of DropletUtils will read in
the additional features as extra rows in the count matrix. This reflects
how they are stored in the 10X output format. The row metadata will
record the nature of the feature.

In some cases it may be desirable to keep all the features together. For
starters, it seems like many of the biases are likely to be shared
(w.r.t. library preparation and capture efficiency), so one could
imagine using the same scaling factors for normalization of both
antibody-based features and endogenous mRNAs. In addition, all of the
scater visualization methods rely on SCE inputs, so if you want to
overlay them with protein marker intensities, they'll need to be in the
same matrix.

If you really need to only use mRNAs or antibody-based features, (i) you
can explicitly subset the SCE based on the rowData, or (ii) pass a
subsetting vector to the various scran/scater/whatever functions to tell
them to only use the specified features. Admittedly, if you're going to
be doing this a lot, it would be more convenient to form a MAE
containing two SCEs so that you only have to pass the SCE you want into
those functions.
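
As a rough sketch of option (i) and of the two-SCE MAE idea above, under
some assumptions: the rowData column name "Type", its values, and the
toy feature names below are made up for illustration and may not match
what DropletUtils actually records.

```r
library(SingleCellExperiment)
library(MultiAssayExperiment)

# Toy combined count matrix: 4 genes and 2 antibody tags across 10 cells.
set.seed(1)
counts <- matrix(rpois(60, lambda = 5), nrow = 6, ncol = 10,
                 dimnames = list(c(paste0("gene", 1:4), "CD3", "CD19"),
                                 paste0("cell", 1:10)))
sce <- SingleCellExperiment(assays = list(counts = counts))

# Assumed row metadata recording the nature of each feature.
rowData(sce)$Type <- c(rep("Gene Expression", 4),
                       rep("Antibody Capture", 2))

# Option (i): explicitly subset the SCE based on the rowData.
is.ab <- rowData(sce)$Type == "Antibody Capture"
sce.rna <- sce[!is.ab, ]
sce.adt <- sce[is.ab, ]

# MAE containing the two SCEs; the cells (columns) are shared, so the
# default colData and sampleMap are derived from the column names.
mae <- MultiAssayExperiment(
    experiments = ExperimentList(rna = sce.rna, adt = sce.adt))
```

After this, experiments(mae)[["rna"]] or experiments(mae)[["adt"]] can
be passed to whichever function should only see one feature type.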

To that end I would be willing to entertain a PR to DropletUtils to
create a MAE from an SCE. I'm more reluctant to add an isSpike()-like
function. The rationale behind isSpike() was that spike-ins are constant
across cells (theoretically) and thus a function could use this
information to improve its calculations. It's less clear what
mathematically useful information can be gained from protein markers -
biological info, yes, but nothing that you would use to change your
algorithm.

-A

Steve Lianoglou wrote:
> Comrades,
>
> Sorry if I'm out of the loop and have missed anything obvious.
>
> I was curious what the plans are in the single-cell bioconductor-verse
> to support single cell experiments that produce counts from different
> feature-spaces, such as those produced by CITE-seq / REAP-seq, for
> instance.
>
> In these types of experiments, I'm pretty sure we want the counts
> generated from those "features" (oligo-conjugated Antibodies, for
> instance) to be kept in a separate space than the mRNA counts. I think
> we would most naturally want to put these in something like an
> `assay()` matrix with a different (rowwise) dimension than the gene
> count matrix, but that can't work since all matrices in the assay()
> list need to be of the same dimensions.
>
> Another option might be to just add them as rows to the assay
> matrices, but keep some type of feature space meta-information akin to
> what `isSpike()` currently does;
>
> or add a new slot to SingleCellExperiment to hold counts from
> different feature spaces, perhaps?;
>
> Or rely on something like a MultiAssayExperiment?
>
> Or?
>
> Curious to learn which way you folks are leaning ...
>
> Thanks!
> -steve
>
> ps - sorry if this email came through twice, it was somehow magically
> sent from an email address I don't have access to anymore.
>


