[BioC] miRNA annotation
mailinglist.honeypot at gmail.com
Mon May 7 20:59:20 CEST 2012
On Mon, May 7, 2012 at 2:15 PM, Wu, Huiyun <william.wu at vanderbilt.edu> wrote:
> Hello there,
> I am new in seq data analysis, and have difficulty in miRNA annotation. To clarify my question, I have the following codes.
> bamfile <- "GSE33858_1.bowtie_alignment.bam"
> aln <- readBamGappedAlignments(bamfile)
> exonRangesList <- exonsBy(txdb, by = "gene")
> exonRanges <- unlist(exonRangesList)
> strand(exonRanges) <- "*"
> geneNames <- sub("\\..*$", "", names(exonRanges))
> exonRangesListNoStrand <- split(exonRanges, geneNames)
> numberOfexons <- sapply(width(exonRangesListNoStrand), length)
> geneLength <- sum(width(reduce(exonRangesListNoStrand)))
> # Counting reads for the BAM file
> counts <- countOverlaps(exonRangesListNoStrand, aln)
> names(counts) <- names(exonRangesListNoStrand)
> > length(counts)
>  22932
> So my question is from aligned file to annotated file. Specifically, why I got 22932 features after mapping to hg19? I learned there are only less than 2000 miRNAs matched genes. I also know there are several RNA databases including miRBase for annotation. Did I do something logically wrong? It would be greatly appreciated if someone could show me how to implement this annotation in R.
What does `sum(counts == 0)` give you?
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
More information about the Bioconductor