[BioC] How to retrieve exon ID and gene ID from exon coordinates?
James W. MacDonald
jmacdon at uw.edu
Mon Sep 10 23:29:24 CEST 2012
Hi Ying,
On 9/10/2012 4:54 PM, ying chen wrote:
> Hi guys, I have an RNA-Seq data table which has exon coordinates (chrom, start, end) and raw counts. I want to use DEXSeq to look for differential transcripts. To do it I need to get gene IDs and exon IDs from the corresponding exon coordinates. Any suggestions on how to do it? Thanks a lot for the help!
You don't give much to go on. Assuming you are working with a common
species, it is simple. Let's assume you are working with mice.
Something like this should work:
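# Assumes the file has a header row naming the columns chrom, start, end
yourdata <- read.table("yourdata.txt", header=TRUE, stringsAsFactors=FALSE)
library(TxDb.Mmusculus.UCSC.mm9.knownGene)
# All annotated exons, with exon and gene IDs as metadata columns
ex <- exons(TxDb.Mmusculus.UCSC.mm9.knownGene,
            columns=c("exon_id","gene_id"))
yourdata <- GRanges(yourdata$chrom,
                    IRanges(start=yourdata$start, end=yourdata$end))
# Your ranges carry no strand, so ignore strand when matching to the exons
mcols(yourdata) <- mcols(ex)[match(yourdata, ex, ignore.strand=TRUE),]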
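One caveat: match() only pairs a row with an exon whose start and end are identical, so rows that differ by even one base come back unmatched. If that happens, an overlap-based lookup with findOverlaps() is one alternative. Here is a minimal sketch on made-up toy ranges (the exon IDs, gene names, and coordinates are invented for illustration; with real data, ex would come from exons() as above):

```r
library(GenomicRanges)

# Toy stand-in for the annotated exons; IDs and coordinates are made up
ex <- GRanges(c("chr1", "chr1", "chr2"),
              IRanges(start=c(100, 300, 50), end=c(200, 400, 150)),
              strand=c("+", "+", "-"),
              exon_id=1:3,
              gene_id=c("geneA", "geneA", "geneB"))

# Toy stand-in for the count table; the second range is not an exact exon match
yourdata <- GRanges(c("chr1", "chr2"),
                    IRanges(start=c(300, 60), end=c(400, 150)))

# Unstranded query ranges can overlap exons on either strand
hits <- findOverlaps(yourdata, ex)

# One row per (query range, overlapping exon) pair
annotated <- data.frame(
  chrom   = as.character(seqnames(yourdata))[queryHits(hits)],
  start   = start(yourdata)[queryHits(hits)],
  end     = end(yourdata)[queryHits(hits)],
  exon_id = ex$exon_id[subjectHits(hits)],
  gene_id = ex$gene_id[subjectHits(hits)]
)
```

Note that a range can overlap more than one exon, in which case it shows up in several rows. Also, with a real TxDb the gene_id column is list-like, since an exon can belong to more than one gene, so you may need to collapse it before building a data.frame.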
If you are planning on doing this sort of stuff, do yourself a favor and
read the GenomicFeatures and GenomicRanges vignettes. They are chock
full of info that you will need.
Best,
Jim
> Ying
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099