[BioC] (BSgenome) forgeBSgenomeDataPkg for Sus scrofa problem
Elisabetta Manduchi
manduchi at pcbi.upenn.edu
Mon Oct 10 22:11:47 CEST 2011
Thanks for the response. I guess I hadn't fully understood this seed file
field. I've now set it to 4 (since I have gap.txt plus the chromosomal RM
and TRF files, and since no files are needed for the AMB masks) and the
function is running now.
Elisabetta
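
For readers hitting the same error, here is a minimal sketch of the
mask-related lines of the seed file after the change described above. It
assumes the four "standard" masks (AGAPS, AMB, RM, TRF) and uses only field
names and values that appear in this thread. Where the nmask_per_seq line
sits relative to the other fields is an assumption; as far as I know the
order of fields in the seed file does not matter.

```
nmask_per_seq: 4
AGAPSfiles_type: gap
AGAPSfiles_name: gap.txt
```

The rest of the seed file (including seqs_srcdir and masks_srcdir) stays as
it was, and forgeBSgenomeDataPkg() is then called on the seed file again.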
---
On Mon, 10 Oct 2011, Hervé Pagès wrote:
> Hi Elisabetta,
>
> Handling of missing nmask_per_seq field was broken (should have been
> set to 0 when missing). I just fixed this in BSgenome release (1.20.1)
> and devel (1.21.7). Anyway, in your case, it seems like you *do* have
> masks, so you need to have the nmask_per_seq field explicitly set
> to a non-zero value in your seed file. For example, if you have the 4
> "standard" masks:
>
> nmask_per_seq: 4
>
> You can look at the seed file for hg19 in the BSgenome package
> (BSgenome/inst/extdata/GentlemanLab/BSgenome.Hsapiens.UCSC.hg19-seed)
> for an example.
>
> Please let me know if you have further questions about this.
>
> Cheers,
> H.
>
>
> On 11-10-07 11:21 AM, Elisabetta Manduchi wrote:
>>
>> Hello,
>> I'm trying to build a data package for Sus scrofa with BSgenome (R
>> version 2.13.2 and BSgenome version 1.20.0).
>> At the bottom of this email I've copied my seed file.
>> I've downloaded the sequence files from UCSC and checked the md5sums.
>> I've also downloaded the gap.txt and masks files (chr*.fa.out and
>> chr*.bed) from UCSC (but no md5sums were provided).
>> I've followed the instructions from
>> http://bioconductor.org/packages/2.8/bioc/vignettes/BSgenome/inst/doc/BSgenomeForge.pdf
>>
>> and I'm getting the following error
>>
>> ---
>>> forgeBSgenomeDataPkg("./BSgenome.Sscrofa.UCSC.susScr2-seed")
>> Error in forgeBSgenomeDataPkg(y, seqs_srcdir = seqs_srcdir, masks_srcdir
>> = masks_srcdir, :
>> values for symbols NMASKPERSEQ are not single strings
>> ---
>>
>> Can you advise on what the problem might be?
>> Thanks,
>> Elisabetta
>>
>>
>> *SEED file BSgenome.Sscrofa.UCSC.susScr2-seed*
>>
>> Package: BSgenome.Sscrofa.UCSC.susScr2
>> Title: Sus scrofa (Pig) full genome (UCSC version susScr2)
>> Description: Sus scrofa (Pig) full genome as provided by UCSC (susScr2,
>> Nov. 2009)
>> Version: 0.1-0
>> Author: Elisabetta Manduchi <manduchi at pcbi.upenn.edu>
>> Maintainer: Elisabetta Manduchi <manduchi at pcbi.upenn.edu>
>> License: GPL-3
>> organism: Sus scrofa
>> species: Pig
>> provider: UCSC
>> provider_version: susScr2
>> release_date: Nov. 2009
>> release_name: SGSC Sscrofa9.2
>> source_url: http://hgdownload.cse.ucsc.edu/goldenPath/susScr2/
>> organism_biocview: Sus_scrofa
>> BSgenomeObjname: Sscrofa
>> seqnames: paste("chr", c(1:18, "X", "M"), sep="")
>> circ_seqs: "chrM"
>> SrcDataFiles1: sequences: all the chr*.fa.gz files from
>> ftp://hgdownload.cse.ucsc.edu/goldenPath/susScr2/chromosomes/
>> SrcDataFiles2: AGAPS masks: the gap.txt.gz file from
>> http://hgdownload.cse.ucsc.edu/goldenPath/susScr2/database/; RM masks:
>> http://hgdownload.cse.ucsc.edu/goldenPath/susScr2/bigZips/chromOut.tar.gz;
>> TRF masks:
>> http://hgdownload.cse.ucsc.edu/goldenPath/susScr2/bigZips/chromTrf.tar.gz
>> seqs_srcdir:
>> /mnt/files/cbil/data/cbil/UHTS/Davies/AAvsDT_DNAmethyl/working_dir/MEDIPS/BSgenome.Sscrofa.UCSC.susScr2/seqs
>>
>> masks_srcdir:
>> /mnt/files/cbil/data/cbil/UHTS/Davies/AAvsDT_DNAmethyl/working_dir/MEDIPS/BSgenome.Sscrofa.UCSC.susScr2/masks
>> AGAPSfiles_type: gap
>> AGAPSfiles_name: gap.txt
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages at fhcrc.org
> Phone: (206) 667-5791
> Fax: (206) 667-1319
>