[Bioc-devel] file registry - feedback

Hervé Pagès hpages at fhcrc.org
Tue Mar 11 06:31:59 CET 2014


Hi Val,

I think it would help us understand the motivation behind this proposal
if you could give an example of a method where the user cannot supply
a file name but has to create a 'File' (or 'FileList') object first,
and show how the file registry proposed below would help.
It looks like you have such an example in the GenomicFileViews package.
Do you think you could give more details?

Thanks,
H.


On 03/10/2014 08:46 PM, Valerie Obenchain wrote:
> Hi all,
>
> I'm soliciting feedback on the idea of a general file 'registry' that
> would identify file types by their extensions. This is similar in spirit
> to FileForFormat() in rtracklayer but a more general abstraction that
> could be used across packages. The goal is to allow a user to supply
> only file name(s) to a method instead of first creating a 'File' class
> such as BamFile, FaFile, BigWigFile etc.
>
> A first attempt at this is in the GenomicFileViews package
> (https://github.com/Bioconductor/GenomicFileViews). A registry (lookup)
> is created as an environment at load time:
>
> .fileTypeRegistry <- new.env(parent=emptyenv())
>
> Files are registered with an information triplet consisting of class,
> package and regular expression to identify the extension. In
> GenomicFileViews we register FaFileList, BamFileList and BigWigFileList
> but any 'File' class can be registered that has a constructor of the
> same name.
>
> .onLoad <- function(libname, pkgname)
> {
>      registerFileType("FaFileList", "Rsamtools", "\\.fa$")
>      registerFileType("FaFileList", "Rsamtools", "\\.fasta$")
>      registerFileType("BamFileList", "Rsamtools", "\\.bam$")
>      registerFileType("BigWigFileList", "rtracklayer", "\\.bw$")
> }
>
> The makeFileType() helper creates the appropriate class. This function
> is used behind the scenes to do the lookup and coerce to the correct
> 'File' class.
>
>  > makeFileType(c("foo.bam", "bar.bam"))
> BamFileList of length 2
> names(2): foo.bam bar.bam
>
> New types can be added at any time with registerFileType():
>
> registerFileType("NewClass", "NewPackage", "\\.NewExtension$")
>
>
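
To make the mechanism described above concrete, here is a minimal sketch of how such a registry and lookup could be written in base R. It is not the GenomicFileViews code: the function bodies are assumptions, and lookupFileType() is a hypothetical helper that only returns the registered class and package names instead of calling a constructor, so it runs without Rsamtools or rtracklayer installed.

```r
# Registry: an environment keyed by regular expression.
.fileTypeRegistry <- new.env(parent = emptyenv())

# Assumed implementation: store a (class, package, regex) triplet
# under its regex.
registerFileType <- function(class, package, regex) {
    assign(regex, list(class = class, package = package, regex = regex),
           envir = .fileTypeRegistry)
    invisible(NULL)
}

# Hypothetical helper: return the triplet whose regex matches every file
# name, and fail if zero or several entries match.
lookupFileType <- function(files) {
    entries <- mget(ls(.fileTypeRegistry, all.names = TRUE),
                    envir = .fileTypeRegistry)
    hits <- Filter(function(e) all(grepl(e$regex, files)), entries)
    if (length(hits) != 1L)
        stop("expected exactly one matching file type, found ", length(hits))
    hits[[1L]]
}

registerFileType("FaFileList", "Rsamtools", "\\.fa$")
registerFileType("FaFileList", "Rsamtools", "\\.fasta$")
registerFileType("BamFileList", "Rsamtools", "\\.bam$")
registerFileType("BigWigFileList", "rtracklayer", "\\.bw$")

type <- lookupFileType(c("foo.bam", "bar.bam"))
type$class    # "BamFileList"
type$package  # "Rsamtools"
```

A full makeFileType() would presumably go on to fetch the constructor of that name from the registered package (for example with getExportedValue(type$package, type$class)) and call it on the file names; that step is left out of the sketch.
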
> Thoughts:
>
> (1) If this sounds generally useful where should it live? rtracklayer,
> GenomicFileViews or other? Alternatively it could be its own lightweight
> package (FileRegister) that creates the registry and provides the
> helpers. It would be up to the package authors that depend on
> FileRegister to register their own file types at load time.
>
> (2) To avoid potential ambiguities, maybe searching should be by regex
> and package name. Still a work in progress.
>
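
On (2), one way the ambiguity could be handled is to key entries by package as well as regex and let the caller narrow the search by package. The sketch below is an illustration only: the clash on "\\.bed$", the names BedFileA, BedFileB, pkgA and pkgB, and the findFileTypes() helper are all hypothetical.

```r
# Registry keyed by "package::regex" so two packages can claim one extension.
registry <- new.env(parent = emptyenv())

registerFileType <- function(class, package, regex) {
    key <- paste(package, regex, sep = "::")
    assign(key, list(class = class, package = package, regex = regex),
           envir = registry)
    invisible(NULL)
}

# Hypothetical search: match on regex, optionally restricted to one package.
# Returns the names of all matching classes.
findFileTypes <- function(file, package = NULL) {
    entries <- mget(ls(registry, all.names = TRUE), envir = registry)
    keep <- vapply(entries, function(e) {
        grepl(e$regex, file) && (is.null(package) || e$package == package)
    }, logical(1))
    unname(vapply(entries[keep], function(e) e$class, character(1)))
}

# Made-up packages that both register the same extension.
registerFileType("BedFileA", "pkgA", "\\.bed$")
registerFileType("BedFileB", "pkgB", "\\.bed$")

findFileTypes("peaks.bed")                    # both classes match: ambiguous
findFileTypes("peaks.bed", package = "pkgB")  # only "BedFileB"
```

Searching by regex alone returns both candidates here, while adding the package name picks out one of them.
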
>
> Valerie
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


