[BioC] insert Ns for repeat masked regions
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Mar 2 04:03:05 CET 2011
Hi,
On Tue, Mar 1, 2011 at 5:50 PM, rna seq <rna.seeker at gmail.com> wrote:
> Hello List,
>
> I am trying to retrieve a sequence of ~1000 nts using the getSeq() function
> from the BSgenome package
>
> I would like to replace the repeat masked regions with Ns
>
> using something similar to the injectSNPs() function used with the
> SNPlocs.Hsapiens.dbSNP package.
>
> So far I can grab sequence from the genome either masked: getSeq(Hsapiens,
> "chr21", 33665196, 33665435, as.character=FALSE)
>
> or unmasked: getSeq(hg19snp, "chr21", 33665196, 33665435,
> as.character=FALSE)
>
> The problem is that the masked function returns a gap:
>
> TCCCAGGATGTGACATTGTTTGCCAGTGCAGAGGC...GGAGCTTTGGAAGAAGAGAGAGTTGACTACGGAAA
>
> and I would like the gap to be filled with Ns?
I don't think there is a gap there: the '...' in the middle is just
how XString objects abbreviate themselves when they are "shown" in R.
Is that what you're talking about?
Look at the result you get when you set as.character=TRUE:
R> library(BSgenome.Hsapiens.UCSC.hg19)
R> getSeq(Hsapiens, "chr21", 33665196, 33665435, as.character=TRUE)
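To check that the '...' is only a display abbreviation, here is a minimal sketch that assumes nothing beyond the Biostrings package. The sequence is a made-up random one (not the chr21 region above), and the names `x` and `s` are just for illustration:

```r
library(Biostrings)

# Build a hypothetical 240-letter DNA sequence (random, for illustration only)
set.seed(1)
letters240 <- sample(c("A", "C", "G", "T"), 240, replace = TRUE)
x <- DNAString(paste(letters240, collapse = ""))

# Printing the object typically abbreviates the middle with "..."
# when the sequence is wider than the console
print(x)

# The object itself still holds every letter
length(x)

# Converting to a plain character string gives the full sequence,
# with no literal "..." in it
s <- as.character(x)
nchar(s)
grepl("...", s, fixed = TRUE)
```

If `length(x)` and `nchar(s)` agree with the width you asked for, nothing is missing from the sequence and only the printed form was shortened.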
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact