[BioC] vmatchPDict?
David Iles
D.E.Iles at leeds.ac.uk
Fri Dec 14 12:45:33 CET 2012
Hi,
I need to re-map the probe sequences of the Affymetrix Bovine genome array to a recent draft sequence of the sheep genome (please, don't ask why...). As a first step, I successfully created a new BSgenome package from a seed file, listing the individual chromosomes as 'seqnames' and the two multiple-sequence fasta files of unmapped sequence (unmapped_scaffolds and unmapped_contigs) as 'mseqnames', as per the forgeBSgenomeDataPkg vignette (see session info below).
When calling the matchPDict() function to map the probe sequences to the + and - strands of the individual chromosomes, all went smoothly, but the following error occurred with the multiple-sequence objects:
> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")
Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in matchPDict(pdict, subject) :
please use vmatchPDict() when 'subject' is an XStringSet object (multiple sequence)
So I edited my script to call vmatchPDict() instead, with the following result:
> runAnConScaff(bt.probes.all, outfile="bt.probes.2.oarv3.1.unmapped.txt")
Target: strand + of Oar v3.1 sequence unmapped_scaffolds, unmapped_contigs
>>> Finding all hits in strand + of sequence unmapped_scaffolds ...
Error in .local(pdict, subject, max.mismatch, min.mismatch, with.indels, :
vmatchPDict() is not ready yet, sorry
While I can work around this by splitting the multiple sequences into loads of small fasta files, each with a single sequence, I wondered: will the vmatchPDict() function be ready in the not-too-distant future?
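For reference, the workaround may not need separate fasta files at all. Below is a minimal sketch, assuming the multiple sequences are available as a DNAStringSet: it calls matchPDict() once per sequence, since each element extracted with [[ is a single DNAString. The probes, the scaffold sequences and the helper name match_per_sequence() are made up for illustration and are not part of my actual script.

```r
library(Biostrings)

# Toy probes of constant width (hypothetical data, not the real array probes).
probes <- DNAStringSet(c(p1 = "ACGTTGCA", p2 = "TTGACCAA"))
pdict <- PDict(probes)

# Toy multiple-sequence subject standing in for the unmapped scaffolds.
scaffolds <- DNAStringSet(c(
  scafA = "GGACGTTGCACC",
  scafB = "AATTGACCAATT"
))

# Hypothetical helper: run matchPDict() on each sequence in turn.
# subject_set[[i]] is a single DNAString, which matchPDict() accepts.
match_per_sequence <- function(pdict, subject_set) {
  hits <- lapply(seq_along(subject_set), function(i) {
    matchPDict(pdict, subject_set[[i]])
  })
  names(hits) <- names(subject_set)
  hits
}

# + strand: match the probes as they are.
plus_hits <- match_per_sequence(pdict, scaffolds)

# - strand: match the reverse-complemented probes against the same
# sequences, so that hit coordinates stay relative to the + strand.
minus_pdict <- PDict(reverseComplement(probes))
minus_hits <- match_per_sequence(minus_pdict, scaffolds)

# Total number of + strand hits per scaffold.
hit_counts <- vapply(seq_along(scaffolds), function(i) {
  sum(countPDict(pdict, scaffolds[[i]]))
}, integer(1))
names(hit_counts) <- names(scaffolds)
```

Each element of plus_hits and minus_hits is the usual MIndex object returned by matchPDict(), so the downstream code written for single chromosomes should apply to it unchanged. With many thousands of scaffolds the loop may be slow, but it avoids writing any intermediate files.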
Many thanks
Dr David Iles
School of Biology
University of Leeds
Leeds LS2 9JT
d.e.iles at leeds.ac.uk
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Oaries.ISGC.Oarv3.1 BSgenome_1.26.1 Biostrings_2.26.2
[4] GenomicRanges_1.10.5 IRanges_1.16.4 BiocGenerics_0.4.0
loaded via a namespace (and not attached):
[1] parallel_2.15.2 stats4_2.15.2 tools_2.15.2
>