[BioC] Get genomic sequences

Paul Leo p.leo at uq.edu.au
Mon Jan 11 11:15:41 CET 2010



If you have a BED file, then all you need are the BSgenome.* packages to
get the sequences:

library("BSgenome.Mmusculus.UCSC.mm9")
all.genomic<-getSeq(Mmusculus, the.chrom, starts, ends)

where
> the.chrom[1:5]
[1] "chr1" "chr1" "chr1" "chr1" "chr1"
> starts[1:5]
[1] 3187526 3487463 3777276 4144186 4274111
> ends[1:5]
[1] 3187790 3487763 3777555 4144499 4274416
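
In case it helps, here is a rough sketch of how the three vectors above could be built from a BED file with base R. The file name and its contents are made up for illustration (I chose them so the result matches the first three values shown above), and I am assuming a plain three-column, tab-separated file with no track or browser header lines. BED starts are 0-based and BED ends are exclusive, so only the start needs shifting to get 1-based inclusive coordinates.

```r
# Hypothetical toy BED file, written to a temp path just for this sketch
bed.file <- tempfile(fileext = ".bed")
writeLines(c("chr1\t3187525\t3187790",
             "chr1\t3487462\t3487763",
             "chr1\t3777275\t3777555"), bed.file)

# Read the first three BED columns (chrom, chromStart, chromEnd)
bed <- read.table(bed.file, sep = "\t", header = FALSE,
                  col.names = c("chrom", "chromStart", "chromEnd"),
                  stringsAsFactors = FALSE)

the.chrom <- bed$chrom
starts    <- bed$chromStart + 1L  # 0-based start -> 1-based start
ends      <- bed$chromEnd         # exclusive end == 1-based inclusive end
```

A real BED file may carry extra columns (name, score, strand) or header lines, so the read.table() call would need adjusting for those.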


etc etc...
myb<-"YAACKG"
myb.dna<-DNAString(myb) # the pattern as a DNAString, used below
length(all.genomic)
system.time(x<- XStringViews(all.genomic, "DNAString"))
x.labels<-paste(the.chrom,starts,ends,sep=":")
names(x)<-x.labels
###################### forward counts #################
# matchPattern needs a stringView to vectorize
all.matches<-matchPattern(myb.dna,x,max.mismatch=0, fixed=FALSE)
the.cov<-coverage(all.matches)
counts<-aggregate(the.cov,start=start(x),end=end(x),FUN=sum)/length(myb.dna)
######################################################
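
If the sequences are held as a DNAStringSet rather than as views on one long string, a possible alternative for the per-region counts is vcountPattern() from Biostrings. This is only a sketch under that assumption, not the approach used above: the three toy sequences are invented stand-ins for the output of getSeq(), and the reverse-strand count is an extra that the forward-count code above does not do.

```r
library(Biostrings)

# Toy sequences standing in for the fetched regions (made up for illustration)
seqs <- DNAStringSet(c(r1 = "TTCAACGGTT",
                       r2 = "AAAAAAAA",
                       r3 = "TAACTGCAACGG"))

myb.dna <- DNAString("YAACKG")  # IUPAC codes: Y = C or T, K = G or T

# fixed=FALSE lets the ambiguity codes in the pattern match the bases they stand for;
# the result is one count per sequence
fwd.counts <- vcountPattern(myb.dna, seqs, max.mismatch = 0, fixed = FALSE)

# Reverse-strand hits can be counted by searching for the reverse complement
rev.counts <- vcountPattern(reverseComplement(myb.dna), seqs,
                            max.mismatch = 0, fixed = FALSE)
```

For these toy sequences the forward counts are 1, 0 and 2 (CAACGG in r1; TAACTG and CAACGG in r3).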

-----Original Message-----
From: Johannes Waage <johannes.waage at bric.dk>
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] Get genomic sequences
Date: Mon, 11 Jan 2010 10:56:21 +0100

Hi all,

Is there a way to fetch genomic sequences via Bioconductor directly? (I am
currently using Galaxy, but I would like to automate this.)

I tried rtracklayer and biomaRt - rtracklayer doesn't seem to have an
interface for fetching sequences, and biomaRt only seems to fetch sequences
for a subset of gene IDs, while I just need to fetch the sequence for a
genomic range.

fetchSequence(chr, strand, start, end) -> sequence

Any suggestions?

Thank you in advance!!

Best regards,
JW,
Uni. of Copenhagen

