[BioC] How to annotate genomic coordinates

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Nov 13 11:42:46 CET 2012


Hi,

On Tue, Nov 13, 2012 at 5:00 AM, José Luis Lavín
<jluis.lavin at unavarra.es> wrote:
> Dear all,
>
> I tried Valerie's approach but I came across an error I don't really know
> how to fix (I am not familiar with R errors yet, but if I keep performing so
> badly I guess I will soon come across quite a few of them in a short
> time...)
>
> I will write the code I tested and the error I got:
>
> #I had to install some libraries from Bioconductor at the beginning
>
> source("http://bioconductor.org/biocLite.R")
> biocLite('Mus.musculus')
> biocLite('TxDb.Mmusculus.UCSC.mm9.knownGene')
> biocLite('VariantAnnotation')
>
>
> library(Mus.musculus)
> library(TxDb.Mmusculus.UCSC.mm9.knownGene)
>
> txdb <- TxDb.Mmusculus.UCSC.mm9.knownGene
>
> #gr <- GRanges(seq = "chr17", IRanges(start = 31245606, width = 20))
>
> y <- read.delim("/path_to_dir/ids.txt", sep=".", header=FALSE, as.is=TRUE)
>
>
> probes<- GRanges(seqnames=y[,1], ranges=IRanges(start=y[,2], width=1))
>
> # Let's see if ids.txt was read correctly into probes
>
> head(probes)
>
> 1         chr13  21272514
> 2         chr13  21272519
> 3         chr13  21272525
> 4         chr13  21272533
> 5         chr13  21295151
> 6         chr13  21295172
>
> # seems correct to me

This doesn't seem correct to me, actually. Taking the `head` of a
proper GRanges object should print a "GRanges with ..." header plus
seqnames, ranges and strand columns around the data, for instance:

R> probes <- GRanges('chr13', IRanges(c(21272514, 21272519, 21272525), width=1))
R> head(probes)

GRanges with 3 ranges and 0 metadata columns:
      seqnames               ranges strand
         <Rle>            <IRanges>  <Rle>
  [1]    chr13 [21272514, 21272514]      *
  [2]    chr13 [21272519, 21272519]      *
  [3]    chr13 [21272525, 21272525]      *
  ---
  seqlengths:
   chr13
      NA

You see how these two are very different?

You can also check to see what your `probes` object is, like so:

R> class(probes)
[1] "GRanges"
attr(,"package")
[1] "GenomicRanges"
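
As a point of comparison, here is a minimal base-R sketch (no
Bioconductor needed) of what a plain data.frame holding the same
values looks like. The object name `y_demo` is made up for
illustration; the V1/V2 column names are what
`read.delim(..., header=FALSE)` assigns by default.

```r
# Hypothetical stand-in for the table read from ids.txt
y_demo <- data.frame(V1 = rep("chr13", 3),
                     V2 = c(21272514L, 21272519L, 21272525L),
                     stringsAsFactors = FALSE)

# Prints bare rows and columns, with no "GRanges with ..." header
print(head(y_demo))

# A data.frame reports a single class and carries no "package" attribute
cls <- class(y_demo)
print(cls)
```

If `class(probes)` reports "data.frame" like this, the object was
never converted to a GRanges.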

But I'm guessing your `probes` object is actually still a data.frame
(in which case I am a bit surprised that your call to `probes <-
GRanges(y[,1], ...)` didn't result in an error) because:

> library(VariantAnnotation)
>
> loc <- locateVariants(query=y, subject=txdb, region=AllVariants())
>
> # Error in function (classes, fdef, mtable)  :
> # unable to find an inherited method for function "locateVariants",
> # for signature "data.frame", "TranscriptDb", "AllVariants"

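R is telling you that you are calling `locateVariants` with a query
that is a `data.frame` object, and there is no `locateVariants`
method defined for that. Note that the call above passes `query=y`,
the data.frame returned by `read.delim`, so make sure you hand it a
GRanges object such as `probes` instead.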
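
To see where this kind of message comes from, here is a small sketch
of S4 dispatch using only the `methods` package that ships with R.
The class `ToyRanges` and the generic `locateToy` are hypothetical
stand-ins invented for this example, not part of VariantAnnotation;
the point is only that a generic with no method for `data.frame`
fails in the same way.

```r
library(methods)

# Hypothetical class and generic, for illustration only
setClass("ToyRanges", slots = c(pos = "numeric"))
setGeneric("locateToy", function(query, ...) standardGeneric("locateToy"))

# The only method defined is for ToyRanges queries
setMethod("locateToy", "ToyRanges", function(query, ...) length(query@pos))

# Works: the query has a class the generic knows about
ok <- locateToy(new("ToyRanges", pos = c(21272514, 21272519)))

# Fails: no method is defined for a data.frame query
df <- data.frame(chr = "chr13", pos = 21272514)
msg <- tryCatch(locateToy(df), error = function(e) conditionMessage(e))
print(msg)
```

The exact wording of the message varies between R versions, but it
names the generic and the class of the argument it could not
dispatch on, which is usually enough to spot the wrong object.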

HTH,
-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


