[BioC] Genome position to miRNA or gene name

Steve Lianoglou mailinglist.honeypot at gmail.com
Tue Jan 20 21:19:44 CET 2009


Hi,

> I have got a set of human genome locations that I have found using
> Biostrings and BSGenome alignment e.g.
>
> seqname  start      end        strand  patternID
> chr9     95978065   95978085   +       TGAGGTAGTAGGTTGTATAGT
> chr11    121522487  121522507  -       TGAGGTAGTAGGTTGTATAGT
> chr22    44887296   44887316   +       TGAGGTAGTAGGTTGTATAGT
> chr22    44888235   44888256   +       TGAGGTAGTAGGTTGTGTGGTT
>
> What I would like to know is whether this genome location is within a
> known miRNA or gene.  What would the best way be to go about this?


One way could be to grab the appropriate GTF file for Hsapiens here:

ftp://ftp.ensembl.org/pub/current_gtf/

It's just a tab-delimited file of genome annotations. You can collect
the lines for miRNA annotations and check whether your positions fall
within the bounds of the known/annotated miRNAs.

For example, here's one:

18   miRNA   exon   38162   38272   .   +   .   gene_id "ENSG00000221441" ...
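
To make the idea concrete, here is a minimal sketch in Python of the
filter-and-compare step described above. It is only an illustration under a
few assumptions: that the second GTF column carries the "miRNA" label (as in
the example line), that the GTF names chromosomes without the "chr" prefix
(again as in the example line, which uses "18"), and that "fall in the
bounds" means the hit lies entirely inside the annotated feature. The
function names and the chr18 query are made up for the demo.

```python
def load_mirna_features(gtf_lines):
    """Return (seqname, start, end, strand, attributes) for every GTF row
    whose second column is 'miRNA'. Coordinates are kept 1-based, inclusive."""
    features = []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[1] != "miRNA":
            continue  # not a full GTF row, or not a miRNA annotation
        features.append(
            (fields[0], int(fields[3]), int(fields[4]), fields[6], fields[8])
        )
    return features


def containing_mirnas(hit, features):
    """Return the features that fully contain the hit (seqname, start, end).
    A leading 'chr' on the hit's seqname is dropped to match the GTF naming."""
    seqname, start, end = hit
    chrom = seqname[3:] if seqname.startswith("chr") else seqname
    return [
        f for f in features
        if f[0] == chrom and f[1] <= start and end <= f[2]
    ]


# Demo with the single example line from above; the attribute column is
# limited to the part shown in the example.
demo_gtf = ['18\tmiRNA\texon\t38162\t38272\t.\t+\t.\tgene_id "ENSG00000221441";\n']
mirnas = load_mirna_features(demo_gtf)

# Hypothetical query position, chosen only so that it falls inside the feature.
for feature in containing_mirnas(("chr18", 38170, 38190), mirnas):
    print(feature)
```

With a real GTF you would pass the open file handle to load_mirna_features()
instead of demo_gtf. Strand is returned but not used in the comparison, so
you can decide afterwards whether a hit on the opposite strand should count.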

You can also get miRNA data from miRBase
(http://microrna.sanger.ac.uk/sequences/); perhaps it's a more complete
set of data? The data file I have from them doesn't have chromosome
positions, but does have the stem-loop sequence, so that would require
an intermediate alignment step before getting at what you're after.

Probably not the best way, but a way nonetheless. If you're  
comfortable using biomaRt, I'm sure there's a way to pull down the  
same annotation info and do the comparison from there.

Sorry ... no R code, but hopefully it's helpful nonetheless :-)

-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

http://cbio.mskcc.org/~lianos


