[BioC] Get genomic sequences
Paul Leo
p.leo at uq.edu.au
Mon Jan 11 11:15:41 CET 2010
If you have a BED file, then all you need are the BSgenome.* packages to
get the sequences. (BED starts are 0-based while getSeq() is 1-based, so
add 1 to the BED starts first.)
library("BSgenome.Mmusculus.UCSC.mm9")
all.genomic<-getSeq(Mmusculus, the.chrom, starts, ends)
where
> the.chrom[1:5]
[1] "chr1" "chr1" "chr1" "chr1" "chr1"
> starts[1:5]
[1] 3187526 3487463 3777276 4144186 4274111
> ends[1:5]
[1] 3187790 3487763 3777555 4144499 4274416
etc.
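The call above can also be wrapped in a helper shaped like the fetchSequence(chr, strand, start, end) asked for in the original message. This is only a sketch: the function name comes from the question, the `genome` argument is my own addition, and it assumes a BSgenome/GenomicRanges version in which getSeq() accepts a GRanges object.

```r
library(BSgenome.Mmusculus.UCSC.mm9)
library(GenomicRanges)

# Hypothetical helper (not part of any package): returns the sequence(s)
# for 1-based, inclusive genomic coordinates on the given strand.
fetchSequence <- function(chr, strand, start, end, genome = Mmusculus) {
  gr <- GRanges(seqnames = chr,
                ranges = IRanges(start = start, end = end),
                strand = strand)
  # For ranges on the "-" strand, getSeq() returns the reverse complement
  getSeq(genome, gr)
}

first.region <- fetchSequence("chr1", "+", 3187526, 3187790)
```

The arguments are vectorized, so passing the.chrom, starts and ends together with a strand vector should give one sequence per region.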
myb<-"YAACKG"
myb.dna<-DNAString(myb) # the motif as a DNAString, used in matchPattern below
length(all.genomic)
system.time(x<- XStringViews(all.genomic, "DNAString"))
x.labels<-paste(the.chrom,starts,ends,sep=":")
names(x)<-x.labels
###################### forward counts #################
all.matches<-matchPattern(myb.dna,x,max.mismatch=0, fixed=FALSE) # needs a stringView to vectorize
the.cov<-coverage(all.matches)
counts<-aggregate(the.cov,start=start(x),end=end(x),FUN=sum)/length(myb.dna)
######################################################
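As a small cross-check on the forward counts, here is a minimal sketch using vcountPattern() from Biostrings on three made-up sequences rather than the mm9 regions. The toy sequences and their names are assumptions for illustration only, and the reverse-strand search is an addition that the code above does not do.

```r
library(Biostrings)

# IUPAC motif from above: Y = C or T, K = G or T
myb.dna <- DNAString("YAACKG")

# Toy sequences (hypothetical), standing in for the regions returned by getSeq
toy <- DNAStringSet(c(a = "TTCAACGGTT", b = "AAAAAAAA", c = "TAACTGCAACGG"))

# fixed=FALSE lets the IUPAC ambiguity codes match the bases they stand for
fwd.counts <- vcountPattern(myb.dna, toy, max.mismatch = 0, fixed = FALSE)

# Reverse strand: search for the reverse complement of the motif
rev.counts <- vcountPattern(reverseComplement(myb.dna), toy,
                            max.mismatch = 0, fixed = FALSE)

names(fwd.counts) <- names(toy)
names(rev.counts) <- names(toy)
fwd.counts
```

On these toy sequences the forward counts are 1, 0 and 2: CAACGG occurs once in a, and c contains both TAACTG and CAACGG.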
-----Original Message-----
From: Johannes Waage <johannes.waage at bric.dk>
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Get genomic sequences
Date: Mon, 11 Jan 2010 10:56:21 +0100
Hi all,
Is there a way to fetch genomic sequences via Bioconductor directly? (I am
using Galaxy at the moment, but I would like to automate this.)
I tried rtracklayer and biomaRt. rtracklayer doesn't seem to have an
interface for fetching sequences, and biomaRt only seems to fetch sequences
for a subset of gene IDs, while I just need to fetch the sequence for a
genomic range, something like:
fetchSequence(chr, strand, start, end) -> sequence
Any suggestions?
Thank you in advance!!
Best regards,
JW,
Uni. of Copenhagen
_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor