[Bioc-devel] BSgenome forge file input file restrictions

Hervé Pagès hpages at fhcrc.org
Mon Sep 29 21:23:48 CEST 2014


Hi Florian,

True. These restrictions don't make much sense anymore!
Some of them are gone in the devel version of BSgenome. The
BSgenomeForge vignette in devel now says:

   The sequence data must be in a single twoBit file (e.g. musFur1.2bit)
   or in a collection of FASTA files (possibly gzip-compressed).

I guess I should also support a single FASTA file.
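
Until then, a single multi-FASTA file can be split into one file per
sequence before forging. Below is a minimal sketch in plain Python
(standard library only). It is not part of BSgenome; the helper names
open_text and split_fasta are made up for illustration. It assumes the
sequence name is the first whitespace-delimited token of each header
line and that this name is safe to use as a file name.

```python
import gzip
from pathlib import Path


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed text file, chosen by the .gz extension."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode)
    return open(path, mode)


def split_fasta(fasta_path, out_dir, compress=False):
    """Write each record of a multi-FASTA file to its own file in out_dir.

    Returns the sequence names in input order. Hypothetical helper,
    not a BSgenome or Biostrings function.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    suffix = ".fa.gz" if compress else ".fa"
    names = []
    out = None
    try:
        with open_text(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A new record starts: close the previous output file.
                    if out is not None:
                        out.close()
                        out = None
                    fields = line[1:].split()
                    if not fields:
                        raise ValueError("FASTA header without a sequence name")
                    # Assumption: the first token of the header is the name.
                    name = fields[0]
                    names.append(name)
                    out = open_text(out_dir / (name + suffix), "wt")
                # Lines before the first header are skipped.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return names
```

For example, split_fasta("genome.fa.gz", "seqs", compress=True) (file
and directory names are placeholders) would write seqs/chr1.fa.gz,
seqs/chr2.fa.gz, and so on, and return the list of names, which is
handy for filling in the sequence names when preparing the forge.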

H.

On 09/29/2014 01:36 AM, Hahne, Florian wrote:
> Hi all,
> I was wondering whether some of the rather arbitrary restrictions on input files for the process of forging a new BSgenome package could be lifted. In particular:
>
> Why do we need all chromosomes in individual files? Couldn't the function be smart enough to just extract the relevant bits from a single file containing all chromosomes? Or even from several such files?
>
> Why are gzipped files not allowed? Pretty much all tools in Biostrings seem to be able to deal with gzipped FASTA files these days.
>
> Thanks,
> Florian
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fhcrc.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


