[Bioc-devel] Any consensus around data structures for methylation data?

Hervé Pagès hpages.on.github sending from gmail.com
Fri Dec 11 01:50:30 CET 2020


Many packages in the methylation business (methylation array or
MethylSeq) together define dozens of data structures (S4 classes) for
representing methylation data. Some of the most prominent ones seem to
be RnBeadSet in RnBeads, MethylSet in minfi, MethyLumiSet in methylumi,
BSdataSet in methylPipe, and BSrel and BSraw in BiSeq. Some are
SummarizedExperiment derivatives, some are eSet derivatives, and others
don't extend anything and reimplement a lot of things from scratch. I
wonder whether some kind of unification/standardization effort has been
considered. It would be great to reach some sort of consensus, as was
done with SingleCellExperiment for single-cell data. Having the
consensus classes implemented in their own package, separate from any
particular application as in the case of SCE, would help with exposure
and reusability.

For context: I'm currently looking at a new submission (MAGAR) that 
implements yet another set of S4 classes (methQTLInput and 
methQTLResult) almost from scratch:
- https://github.com/MPIIComputationalEpigenetics/MAGAR/blob/a5281222426441ed48581cd5b45d1d81d6537ed4/R/methQTLResult-class.R#L38-L57
- https://github.com/MPIIComputationalEpigenetics/MAGAR/blob/a5281222426441ed48581cd5b45d1d81d6537ed4/R/methQTLResult-class.R#L38-L57

I'm hoping that they can avoid that by reusing (either directly or by 
extending) something that's already available in the ecosystem but it's 
not clear to me what to recommend.
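
To make "reusing by extending" concrete, here is a minimal sketch of
what a thin extension of RangedSummarizedExperiment could look like.
The class name MethylExperiment, its constructor, and the requirement
of an assay named "beta" are made up for illustration only; they are
not taken from MAGAR or from any existing package, and the toy data is
invented.

```r
library(SummarizedExperiment)  # also attaches GenomicRanges, IRanges, S4Vectors

## Hypothetical class: adds no slots, only a validity rule on top of
## RangedSummarizedExperiment.
.MethylExperiment <- setClass("MethylExperiment",
    contains = "RangedSummarizedExperiment")

setValidity("MethylExperiment", function(object) {
    if (!"beta" %in% assayNames(object))
        return("a 'beta' assay is required")
    TRUE
})

## Hypothetical constructor: wraps a matrix of beta values (CpGs x samples)
## together with the CpG positions and the sample annotations.
MethylExperiment <- function(beta, rowRanges, colData) {
    se <- SummarizedExperiment(assays = list(beta = beta),
                               rowRanges = rowRanges,
                               colData = colData)
    .MethylExperiment(se)
}

## Toy data, invented for this example.
beta <- matrix(c(0.1, 0.8, 0.5, 0.2, 0.9, 0.4), nrow = 3,
               dimnames = list(c("cg1", "cg2", "cg3"), c("s1", "s2")))
cpgs <- GRanges("chr1", IRanges(start = c(100, 200, 300), width = 1))
names(cpgs) <- rownames(beta)
samples <- DataFrame(group = c("case", "control"),
                     row.names = colnames(beta))

me <- MethylExperiment(beta, cpgs, samples)

## Subsetting and accessors are inherited, nothing is reimplemented.
sub <- me[1:2, ]
rowRanges(sub)
assay(sub, "beta")
```

The point of the sketch is only that subsetting, rowRanges(), colData(),
assay(), etc. come for free from the parent class, so a new package
would only need to add what is specific to its application.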

Thanks,
H.

-- 
Hervé Pagès

Bioconductor Core Team
hpages.on.github using gmail.com


