[Bioc-devel] Modeling (statistic, p-value) pairs in MultiAssayExperiment

Tue Oct 24 15:43:08 CEST 2017

Thank you!

Fig. 1 shows the pipeline for a single database of pathways, but we
used 10 different databases (GO, KEGG, Reactome...). Currently we use
all of MSigDB, which includes 24 subcategories, and for each of them
we have a matrix of enrichment scores (ES) and a matrix of p-values.
The columns are always the same drugs, but the rows are different
pathways. Keeping the databases separate is necessary (you don't want
to rank pathways across unrelated databases). On the other hand, if I
build one SummarizedExperiment for each database, I have to replicate
the common metadata across all of them, and I also lose most of the
benefits that made modeling my data with SE worth the effort :-/.
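
To make the shape of the data concrete, here is a minimal sketch in
plain Python. It is not the SummarizedExperiment or
MultiAssayExperiment API; the class names, method names, drug names,
pathway names and numbers are all hypothetical and only illustrate the
layout: drug metadata stored once, one (ES, p-value) matrix pair per
database, and ranking restricted to a single database.

```python
from dataclasses import dataclass, field


@dataclass
class PathwayAssay:
    """One pathway database: paired matrices, pathways (rows) x drugs (columns)."""
    pathways: list  # row names, specific to this database
    es: list        # es[i][j]: enrichment score of pathway i for drug j
    pvalue: list    # pvalue[i][j]: p-value paired with es[i][j]


@dataclass
class DrugPathwayData:
    """Shared drugs and drug metadata, plus one PathwayAssay per database."""
    drugs: list      # column names, common to every database
    drug_meta: dict  # column metadata, stored once instead of per database
    assays: dict = field(default_factory=dict)

    def add(self, name, assay):
        # Both matrices must have one row per pathway and one column per drug.
        for matrix in (assay.es, assay.pvalue):
            if len(matrix) != len(assay.pathways):
                raise ValueError("row count does not match pathways")
            if any(len(row) != len(self.drugs) for row in matrix):
                raise ValueError("column count does not match drugs")
        self.assays[name] = assay

    def top_pathways(self, database, drug, k=1):
        # Rank by p-value within one database only, never across databases.
        assay = self.assays[database]
        j = self.drugs.index(drug)
        order = sorted(range(len(assay.pathways)),
                       key=lambda i: assay.pvalue[i][j])
        return [assay.pathways[i] for i in order[:k]]


# Made-up example values, for illustration only.
data = DrugPathwayData(
    drugs=["drugA", "drugB"],
    drug_meta={"drugA": {"dose": "low"}, "drugB": {"dose": "high"}},
)
data.add("GO", PathwayAssay(
    pathways=["GO_term_1", "GO_term_2", "GO_term_3"],
    es=[[0.4, -0.7], [0.8, 0.1], [-0.2, 0.6]],
    pvalue=[[0.20, 0.01], [0.03, 0.50], [0.90, 0.04]],
))
data.add("KEGG", PathwayAssay(
    pathways=["KEGG_path_1", "KEGG_path_2"],
    es=[[0.3, 0.9], [-0.8, 0.2]],
    pvalue=[[0.5, 0.02], [0.001, 0.3]],
))

go_top_for_drug_a = data.top_pathways("GO", "drugA", k=2)
kegg_top_for_drug_a = data.top_pathways("KEGG", "drugA")
```

The point of the sketch is only that the column side (drugs and their
metadata) is shared, while the row side differs per database, which is
what a single SummarizedExperiment cannot express.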

Note that I'm considering all this for a package currently under
review, to possibly improve its interoperability with existing packages.

On Tue, Oct 24, 2017 at 2:45 PM, Levi Waldron
<lwaldron.research at gmail.com> wrote:
> On Oct 24, 2017 6:14 AM, "Francesco Napolitano" <franapoli at gmail.com> wrote:
>
> I'm converting gene expression profiles to "pathway expression
> profiles" (https://doi.org/10.1093/bioinformatics/btv536), so for each
> pathway I have an enrichment score and a p-value. I guess it would be
> like modeling gene expression data where limma-like preprocessing was
> performed, so you have a fold change - p-value pair for each gene.
> Isn't there a data model for that?
>
>
> Nice paper, thanks for the link! Could you explain the problem a little more
> using the terminology of your paper? I see your enrichment values matrix
> (fig 1c ESij) of pathways x cell lines, and imagine additional associated
> matrices of p-values and ranks, but where do assays with different rows come
> in?
>