[BioC] GoStats and microRNA pipeline using Biomart

David martin vilanew at gmail.com
Wed Mar 30 15:43:01 CEST 2011


Hi,
I am opening a new thread so as not to confuse this with the previous one.

The objective here is to look for overrepresented GO terms among microRNA 
targets. One microRNA can have several targets (genes), and a single 
gene can be targeted by several microRNAs. The aim is to check, for a 
specific microRNA, which GO terms are overrepresented.


OK, so let's say my microRNA of interest is miR-A.

Step 1: based on my favorite prediction algorithm I have obtained a 
list of genes targeted by miR-A. The genes are given as Ensembl transcripts 
and, as I said before, miR-A can target the same transcript several times 
(at different locations), so I need to account for this.

miR-A targets -> 
(ENST001,ENST001,ENST001,ENST0025,ENST089,ENST099,ENST0099......) up to 
300 different transcripts.
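
To make the repeated hits explicit before any query, here is a minimal sketch, assuming the targets sit in a character vector. The name `targets` and the IDs are placeholders for illustration, not real Ensembl identifiers.

```r
# Hypothetical target list for miR-A: one entry per predicted site,
# so a transcript hit three times appears three times.
targets <- c("ENST001", "ENST001", "ENST001", "ENST0025", "ENST089", "ENST099")

# Number of predicted sites per transcript (the multiplicity to preserve).
site_counts <- table(targets)

# Distinct transcripts, e.g. for use as query values.
unique_targets <- names(site_counts)

print(site_counts)
```

Keeping `site_counts` around means the multiplicity is not lost even if a later annotation step only returns one record per transcript.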

I use biomaRt to get the corresponding GO IDs for these transcripts:

....
#Select mart database
mart <- useMart("ensembl", dataset="hsapiens_gene_ensembl")

# Get GO terms for a specific transcript.
# First problem: biomaRt will not return GO terms twice for duplicated
# transcripts. The example below shows that for transcripts
# c("ENST00000347770","ENST00000347770") I get the same GO terms as for
# transcript c("ENST00000347770").
# As I said before, a microRNA can target the same transcript several
# times, so the GO terms associated with that transcript should be counted
# twice for this particular microRNA.
# Can we force biomaRt to return redundant GO terms?

gomir = getBM(attributes=c(
                 'go_biological_process_id',
                 'go_biological_process_linkage_type',
                 'go_cellular_component_linkage_type',
                 'go_cellular_component_id',
                 'go_molecular_function_id'),
   filters="ensembl_transcript_id",
   values=c("ENST00000347770","ENST00000347770"......), mart=mart)
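
One possible workaround, sketched here under assumptions rather than as a confirmed biomaRt feature: if 'ensembl_transcript_id' is also requested as an attribute, the result can be merged back onto the redundant target list, which repeats the GO rows once per predicted site. The data frame `annot` below is a hand-made stand-in for the getBM() result, and the transcript and GO identifiers are placeholders, not real annotations.

```r
# Redundant target list: one row per predicted site (placeholder IDs).
targets <- data.frame(
  ensembl_transcript_id = c("ENST_A", "ENST_A", "ENST_B"),
  stringsAsFactors = FALSE
)

# Stand-in for the annotation table: one row per transcript/GO pair,
# as it might look if the transcript ID were included in the attributes.
annot <- data.frame(
  ensembl_transcript_id    = c("ENST_A", "ENST_A", "ENST_B"),
  go_biological_process_id = c("GO_term_1", "GO_term_2", "GO_term_1"),
  stringsAsFactors = FALSE
)

# merge() pairs every target row with every matching annotation row,
# so a transcript listed twice contributes its GO terms twice.
expanded <- merge(targets, annot, by = "ensembl_transcript_id")

# GO term counts weighted by the number of predicted sites.
go_counts <- table(expanded$go_biological_process_id)
print(go_counts)
```

Whether counting a GO term once per site is appropriate for the downstream overrepresentation test is a separate question. The sketch only shows how the redundancy could be restored.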

.... I will complete the rest of the pipeline with GOstats once this 
first step is sorted out.


