[Bioc-sig-seq] Extracting DNA sequences from BSgenome.Mmusculus.UCSC.mm9_1.3.11

Ivan Gregoretti ivangreg at gmail.com
Fri May 29 18:04:11 CEST 2009

Hi Hervé,

> With BSgenome 1.12.1 (release) and 1.13.5 (devel) you can now do:
>  myseqs <- data.frame(
>    chr=c("chrY", "chr1", "chr2", "chr3", "chrY", "chr3", "chr1", "chr1"),
>    start=c(NA, -40, 8510201, 4920301, 30001, 9220500, -2804, -30),
>    end=c(50, NA, 8510220, 4920330, 30011, 9220555, -2801, -11)
>  )
>  library(BSgenome.Mmusculus.UCSC.mm9)
>  > getSeq(Mmusculus, myseqs$chr, myseqs$start, myseqs$end)
>  [7] "ATGA"
> to extract multiple subsequences from multiple chromosomes at once.
> (Note support for NAs and negative start or end.)

So, getSeq is vectorised now. Great. That addresses a very common use of getSeq.

> Hopefully this time you won't get hit by the infamous bug you reported
> earlier (BTW anything new on that front? Were you able to reproduce it?
> Thanks).

Bug? The last time I was in real trouble, I solved my problem with
Michael's suggestions on the use of RangedData. But that was a feature
rather than a bug. Bottom line: I stick to RangedData now because it
is relatively easy to manipulate.

Thank you,


Ivan Gregoretti, PhD
National Institute of Diabetes and Digestive and Kidney Diseases
National Institutes of Health
5 Memorial Dr, Building 5, Room 205.
Bethesda, MD 20892. USA.
Phone: 1-301-496-1592
Fax: 1-301-496-9878
