[Bioc-devel] SummarizedExperiment with alternate back end

Tim Triche, Jr. tim.triche at gmail.com
Fri Sep 18 22:56:09 CEST 2015


bigmemoryExtras (Peter Haverty's extensions to bigmemory's big.matrix) can be
handy for this, as it works well as a backend, especially if you split the
data by chromosome, as for CNV segmentation, DMR finding, etc.  It's
not as seamless as one might like, but it's the closest thing I've found.

SciDB tries to implement a similar API, but for a distributed version of
this, where the data itself is in a columnar database and served on demand.
I tried getting that up and running as a SummarizedExperiment backend, but
did not succeed.  I have previously shoveled all of the TCGA 450k data into
one 7,000+ column BigMatrix, which serializes to about 14GB on disk.

If you have any replicates in your ~700 samples, it's a good idea to keep
their SNP calls in metadata(yourSE), although if you change sample names the
change needs to propagate into the dependent metadata.  This is why I started
monkeying around with linkedExperiments, where those mappings are enforced;
it's becoming more of an issue with the TARGET pediatric AML study, where
there are numerous diagnosis-remission-relapse trios whose identity I wish to
verify periodically.  The SNPs on the 450k array are great for this
purpose, but minfi doesn't really have a slot for them per se, so they live
in metadata().
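
A minimal sketch of such a periodic identity check, in base R only.  It
assumes the SNP calls are held as a SNPs-by-samples matrix of 0/1/2
genotypes, such as one might keep in metadata(yourSE).  The helper name
check_snp_identity, the min_concordance cutoff, and the toy sample names
are all hypothetical; none of them come from minfi or SummarizedExperiment.

```r
# Hypothetical helper: confirm that stored SNP calls still line up with the
# sample names, then report pairs of samples whose calls agree closely.
check_snp_identity <- function(sample_names, snp_calls, min_concordance = 0.9) {
  # The SNP columns must track the sample names exactly (same names, same order)
  if (!identical(colnames(snp_calls), sample_names)) {
    stop("SNP call columns are out of sync with the sample names")
  }
  n <- ncol(snp_calls)
  # Fraction of SNPs with identical calls, for every pair of samples
  concordance <- matrix(1, n, n, dimnames = list(sample_names, sample_names))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      concordance[i, j] <- mean(snp_calls[, i] == snp_calls[, j])
    }
  }
  # Keep each pair once (upper triangle) if it clears the cutoff
  hits <- which(concordance >= min_concordance & upper.tri(concordance),
                arr.ind = TRUE)
  data.frame(a = sample_names[hits[, "row"]],
             b = sample_names[hits[, "col"]],
             concordance = concordance[hits],
             stringsAsFactors = FALSE)
}

# Toy data: two samples from one patient share a genotype, a third does not
set.seed(1)
patient1 <- sample(0:2, 65, replace = TRUE)
patient2 <- sample(0:2, 65, replace = TRUE)
snp_calls <- cbind(p1_dx = patient1, p1_relapse = patient1, p2_dx = patient2)

matches <- check_snp_identity(colnames(snp_calls), snp_calls)
print(matches)
```

Passing the experiment's current colnames as sample_names makes a rename
that was not propagated to the SNP matrix fail loudly instead of silently
pairing the wrong samples.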


--t

On Fri, Sep 18, 2015 at 1:29 PM, Vincent Carey <stvjc at channing.harvard.edu>
wrote:

> i am dealing with ~700 450k arrays
>
> they are derived from one study, so it makes sense to think of
> them holistically.
>
> both the load time and the memory consumption are not satisfactory.
>
> has anyone worked on an object type that implements the rangedSE API but
> has the assay data out of memory?
>
> > unix.time(load("wbmse.rda"))
>    user  system elapsed
>  30.131   2.396  61.036
> > object.size(wbmse)
> 124031032 bytes
> > dim(wbmse)
> [1] 485577    690
> > object.size(assays(wbmse))
> 2680430992 bytes
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
