[BioC] I: Help with symbol names mapping between miRecords and BioMart

mauede at alice.it mauede at alice.it
Fri Jul 3 12:57:00 CEST 2009


Please find attached the crude script that worked.
Regards,
Maura 

-----Original message-----
From: mauede at alice.it
Sent: Thu 2 Jul 2009 8:11
To: Michael Watson; Sean Davis; Steve Lianoglou
Subject: Help with symbol names mapping between miRecords and BioMart
 
I extracted some VALIDATED miRNAs and, *hopefully*, paired them with the 3'UTR sequences of their respective VALIDATED target genes.
I am NOT sure about my mapping between BioMart and miRecords object names.
Clearly, the output of my algorithm depends on whether this name mapping is correct (is it?).

 > [miR-130a] 
[1] "TAAACTACCTAACATTATTTATTCAGCTTCATTTGTGTCAATGGGCAATGACAGGTAAATTAAGACATGCACTATGAGGAATAATTATTTATTTAATAACAATTGTTTGGGGTTGAAAATTCAAAAAGTGTTTATTTTTCATATTGTGCCAATATGTATTGTAAACATGTGTTTTAATTCCAATATGATGACTCCCTTAAAATAGAAATAAGTGGTTATTTCTCAACAAAGCACAGTGTTAAATGAAATTGTAAAACCTGTCAATGATACAGTCCCTAAAGAAAAAAAATCATTGCTTTGAAGCAGTTGTGTCAGCTACTGCGGAAAAGGAAGGAAACTCCTGACAGTCTTGTGCTTTTCCTATTTGTTTTCATGGTGAAAATGTACTGAGATTTTGGTATTACACTGTATTTGTATCTCTGAAGCATGTTTCATGTTTTGTGACTATATAGAGATGTTTTTAAAAGTTTCAATGTGATTCTAATGTCTTCATTTCATTGTATGATGTGTTGTGATAGCTAACATTTT"


 > hsa-let-7c 
[1] "ATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTCCATTACATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATTTCAAAACACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCTACCCCATATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGCGGGGACAGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCGTACAGGGATCCACCTTTTTGCAGAAATCACAGTGTGGCTATGGTGTGGTTTGATTTCATAAAACAGATGCTT"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[2] "TTGCATTTCCTAGGTTTCTGTGTTTGGGGTGTGTGTGCGTGTCTCTCTCTCTCTCTCTCTCTTTCTCTTTCTCTCTCTTTTTGAATTTCAAAGAAGAAACAGTCTCAGGGAAATTTCTTTTTTCTTTTTTTTTTTTAAAGAGAACAAGAAAAGTACAACATTGCTTAAGTCCTACCTCATCTTTATTTTTTTACAGATGAATGTACTTATCTTTTCTGCAGGGATTGAGCCTGTGAAGTGATAATTTCTATCTACCTCATAAATCTTTACATTTCCTTCTGCAACAGGCCCTCTTCCCCTCCTCAGTGGAGTTTGCATTTCCCTCTTCCCCTGCGTGGGGCATGATATGCACAAGCCTGGCATCTGTATGGCTGGGAGGGCACTGGATGTGTGTGGTGGGGTGTATTCTGTAGATTGAGCCAAGGAAACACAAAAAAAAACTACTAAGT"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[3] "Sequence unavailable"
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[4] "GCCACCCACCTTGGCCTCTCAAAGTGCTGGGAATACAGGCGTGAGCCATCGTGCCTGGTCTAAAAAATGTCTATTAGTGTTCCATCACTAGATCTCTTCTGAGGTATTCATGCCATATGCCCCATCCTGATGTCATATCCACAGGACAATCTACTACCAAGAACCAGCTCCAAGAAGAAAACATCTCTGGGAAACAGTACCAAAAGGAGTCACTGAATTGTCATTGGAGGAGTCCAGGATAGCTCTTCATGTTATTTTCACCTTGAGGAATTGTCCATTACATCTATGAGCCTTATGTGTGGCTTTCTCCGATATAGAAACCTATCAGGTGTCTTTTAGATCATTTCAAAACACTGGCTTATTCTTTCTTATGTTTCCAACTGAAGTCTGCATCCCAAGATGTAGTTTCACTGCTACCCCATATGGCACCCTTGTACGAATTTGAAAAAAGTACTCACTCTAGGCACATGCAGAGCCATGCCTGCGGGGACAGCTTAGAGAGTAGAGGGTGGGCTGAACTCCAGTTACTCTCG"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[5] "GGGGCGCCAACGTTCGATTTCTACCTCAGCAGCAGTTGGATCTTTTGAAGGGAGAAGACACTGCAGTGACCACTTATTCTGTATTGCCATGGTCTTTCCACTTTCATCTGGGGTGGGGTGGGGTGGGGTGGGGGAGGGGGGGGTGGGGTGGGGAGAAATCACATAACCTTAAAAAGGACTATATTAATCACCTTCTTTGTAATCCCTTCACAGTCCCAGGTTTAGTGAAAAACTGCTGTAAACACAGGGGACACAGCTTAACAATGCAACTTTTAATTACTGTTTTCTTTTTTCTTAACCTACTAATAGTTTGTTGATCTGATAAGCAAGAGTGGGCGGGTGAGAAAAACCGAATTGGGTTTAGTCAATCACTGCACTGCATGCAAACAAGAAACGTGTCACACTTGTGACGTCGGGCATTCATATAGGAAGAACGCGGTGTGTAACACTGTGTACACCTCAAATACCACCCCAACCCACTCCCTGTAGTGAATCCTCTGTTTAGAACACCAAAGATAAGGACTAGATACTACTTTCTCTTTTTCGTATAATCTTGTAGACACTTACTTGATGATTTTTAACTTTTTATTTCTAAATGAGACGAAATGCTGATGTATCCTTTCATTCAGCTAACAAACTAGAAAAGGTTATGTTCATTTTTCAAAAAGGGAAGTAAGCAAACAAATATTGCCAACTCTTCTATTTATGGATATCACACATATCAGCAGGAGTAATAAATTTACTCACAGCACTTGTTTTCAGGACAACACTTCATTTTCAGGAAATCTACTTCCTACAGAGCCAAAATGCCATTTAGCAATAAATAACACTTGTCAGCCTCAGAGCATTTAAGGAAACTAGACAAGTAAAATTATCCTCTTTGTAATTTAATGAAAAGGTACAACAGAATAATGCATGATGAACTCACCTAATTATGAGGTGGGAGGAGCGAAATCTAAATTTCTTTTGCTATAGTTATACATCAATTTAAAAAGCAAAAAAAAAAAAGGGGGGGGCAATCTCTCTCTGTGTCTTTCTCTCTCTCTCTTCCTCTCCCTCTCTCTTTTCATTGTGTATCAGTTTCCATGAAAGACCTGAATACCACTTACCTCAAATTAAGCATATGTGTTACTTCAAGTAATACGTTTTGACATAAGATGGTTGACCAAGGTGCTTTTCTTCGGCTTGAGTTCACCATCTCTTCATTCAAACTGCACTTTTAGCCAGAGATGCAATATATCCCCACTACTCAATACTACCTCTGAATGTTACAACGAATTTACAGTCTAGTACTTATTACATGCTGCTATACACAAGCAATGCAAGAAAAAAACTTACTGGGTAGGTGATTCTAATCATCTGCAGTTCTTTTTGTACACTTAATTACAGTTAAAGAAGCAATCTCCTTACTGTGTTTCAGCATGACTATGTATTTTTCTATGTTTTTTTAATTAAAAATTTTTAAAATACTTGTTTCAGCTTCTCTGCTAGATTTCTACATTAACTTGAAAATTTTTTAACCAAGTCGCTCCTAGGTTCTTAAGGATAATTTTCCTCAATCACACTACACATCACACAAGATTTGACTGTAATATTTAAATATTACCCTCCAAGTCTGTACCTCAAATGAATTCTTTAAGGAGATGGACTAATTGACTTGCAAAGACCTACCTCCAGACTTCAAAAGGAATGAACTTGTTACTTGCAGCATTCATTTGTTTTTTCAATGTTTGAAATAGTTCAAACTGCAGCTAACCCTAGTCAAAACTATTTTTGTAAAAGACATTTGATAGAAAGGAACACGTTTTTACATACTTTTGCAAAATAAGTAAATAATAAATAAAATAAAAGCCAACCTTCAAAGAAACTTGAAGCTTTGTAGGTGAGATGCAACAAGCCCTGCTTTTGCATAATGCAATCAAAAATATGTGTTTTTAAGATTAGTTGAATATAAGAAAATGCTTGACAAATATTTTCATGTATTTTACACAAATG
TGATTTTTGTAATATGTCTCAACCAGATTTATTTTAAACGCTTCTTATGTAGAGTTTTTATGCCTTTCTCTCCTAGTGAGTGTGCTGACTTTTTAACATGGTATTATCAACTGGGCCAGGAGGTAGTTTCTCATGACGGCTTTTGTCAGTATGGCTTTTAGTACTGAAGCCAAATGAAACTCAAAACCATCTCTCTTCCAGCTGCTTCAGGGAGGTAGTTTCAAAGGCCACATACCTCTCTGAGACTGGCAGATCGCTCACTGTTGTGAATCACCAAAGGAGCTATGGAGAGAATTAAAACTCAACATTACTGTTAACTGTGCGTTAAATAAGCAAATAAACAGTGGCTCATAAAAATAAAAGTCGCATTCCATATCTTTGGATGGGCCTTTTAGAAACCTCATTGGCCAGCTCATAAAATGGAAGCAATTGCTCATGTTGGCCAAACATGGTGCACCGAGTGATTTCCATCTCTGGTAAAGTTACACTTTTATTTCCTGTATGTTGTACAATCAAAACACACTACTACCTCTTAAGTCCCAGTATACCTCATTTTTCATACTGAAAAAAAAAGCTTGTGGCCAATGGAACAGTAAGAACATCATAAAATTTTTATATATATAGTTTATTTTTGTGGGAGATAAATTTTATAGGACTGTTCTTTGCTGTTGTTGGTCGCAGCTACATAAGACTGGACATTTAACTTTTCTACCATTTCTGCAAGTTAGGTATGTTTGCAGGAGAAAAGTATCAAGACGTTTAACTGCAGTTGACTTTCTCCCTGTTCCTTTGAGTGTCTTCTAACTTTATTCTTTGTTCTTTATGTAGAATTGCTGTCTATGATTGTACTTTGAATCGCTTGCTTGTTGAAAATATTTCTCTAGTGTATTATCACTGTCTGTTCTGCACAATAAACATAACAGCCTCTGTGATCCCCATGTGTTTTGATTCCTGCTCTTTGTTACAGTTCCATTAAATGAGTAATAAAGTTTGGTCAAAAC"
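
The output above contains a "Sequence unavailable" entry next to real sequences, so such placeholders need to be dropped before the sequences are used. A minimal sketch in Python, assuming the sequences have been exported as plain strings; the function name `usable_sequences` and the A/C/G/T/N check are illustrative assumptions, not part of the attached script:

```python
# Placeholder string that appears in the output above for a target without a sequence.
UNAVAILABLE = "Sequence unavailable"


def usable_sequences(sequences):
    """Return only the entries that look like real nucleotide sequences."""
    kept = []
    for seq in sequences:
        seq = seq.strip()  # the raw output carries a lot of trailing whitespace
        if not seq or seq == UNAVAILABLE:
            continue
        # Assumption: a usable 3'UTR contains only A, C, G, T or N.
        if set(seq.upper()) <= set("ACGTN"):
            kept.append(seq)
    return kept


# Short made-up strings stand in for the long sequences printed above.
example = ["ATTGTCATTGG", "Sequence unavailable", "GCCACCCACC   "]
print(usable_sequences(example))
```

Keeping the check strict means any unexpected text in the result is dropped rather than silently treated as a sequence.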


I downloaded the VALIDATED xls file from miRecords and discarded the records that do not pertain to Homo sapiens. I also
dropped some columns that, as far as I can tell, do not carry data relevant to my goal.
Then, through BioMart functions, I extracted the following fields: 'hgnc_symbol', 'ensembl_gene_id', 'external_gene_id', 'refseq_dna'.
The filters I used assume that
 the BioMart field named "hgnc_automatic_gene_name" is what miRecords calls "Target.gene_name", and
 the BioMart field named "refseq_dna" is what miRecords calls "Target.gene_Refseq_acc".
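
One way to test this mapping empirically is to count how many miRecords records find a partner in the BioMart result. The sketch below is only an illustration in Python, assuming both tables are already loaded as lists of dictionaries keyed by column name and that the gene symbol comes back in the 'hgnc_symbol' field. The function name `match_records` and all values in the example are hypothetical placeholders, not real gene names or accessions.

```python
def match_records(mirecords_rows, biomart_rows):
    """Pair miRecords rows with BioMart rows under the assumed column mapping."""
    # Index BioMart rows by (gene symbol, RefSeq accession), ignoring case.
    index = {}
    for row in biomart_rows:
        key = (row["hgnc_symbol"].upper(), row["refseq_dna"].upper())
        index.setdefault(key, []).append(row)

    matched, unmatched = [], []
    for rec in mirecords_rows:
        key = (rec["Target.gene_name"].upper(),
               rec["Target.gene_Refseq_acc"].upper())
        hits = index.get(key)
        if hits:
            matched.append((rec, hits))
        else:
            unmatched.append(rec)
    return matched, unmatched


# Placeholder rows only; replace with the real tables.
mirecords = [
    {"Target.gene_name": "GENE_A", "Target.gene_Refseq_acc": "ACC_1"},
    {"Target.gene_name": "GENE_B", "Target.gene_Refseq_acc": "ACC_9"},
]
biomart = [
    {"hgnc_symbol": "gene_a", "refseq_dna": "ACC_1"},
    {"hgnc_symbol": "GENE_B", "refseq_dna": "ACC_2"},
]
matched, unmatched = match_records(mirecords, biomart)
print(len(matched), "matched;", len(unmatched), "unmatched")
```

A large unmatched list would suggest that the two databases do not use the names interchangeably, and the unmatched records are the ones worth inspecting by hand.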

Please check my object name mapping and let me know whether I got it right or wrong.

There are a few cases that my algorithm does not handle yet.
That is, some records in the miRecords xls file contain non-standard miRNA identifiers that I do not understand. For instance:
"hsa-miR-15a/hsa-miR-16" (what does the "/" mean?)
"[miR-106b]"   (what do the square brackets mean?)

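Until the meaning of these notations is confirmed, one possible way to handle such identifiers is sketched below in Python. It rests on two unverified assumptions: that "/" separates several miRNAs listed in a single record, and that the square brackets are decoration that can be stripped. The function name `split_mirna_ids` is hypothetical.

```python
def split_mirna_ids(raw):
    """Split a miRecords miRNA field into a list of bare identifiers.

    Assumptions (not confirmed by miRecords): "/" separates several miRNAs
    listed in one record, and square brackets can simply be removed.
    """
    ids = []
    for part in raw.split("/"):
        part = part.strip().strip("[]").strip()
        if part:
            ids.append(part)
    return ids


print(split_mirna_ids("hsa-miR-15a/hsa-miR-16"))  # ['hsa-miR-15a', 'hsa-miR-16']
print(split_mirna_ids("[miR-106b]"))              # ['miR-106b']
```

If either assumption turns out to be wrong, these records should be set aside rather than split.
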
Moreover, there are many redundant lines in my script. It is just a starting point.

It may be possible to get the 3'UTR sequences of VALIDATED targets by downloading the combined information from miRecords and miRDB ...
but that must be harder, because miRecords does not provide any interface library.

I have attached the pruned version of miRecords xls file and my crude script.
I look forward to your feedback.
Thanks a lot,
Maura




-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: My_miRec_Validated_Targets.txt
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20090703/b9355dc8/attachment.txt>

