[BioC] ChIPpeakAnno to find peaks nearest to miRNA

Wed Aug 8 04:46:43 CEST 2012

Paolo,

Just to let you know that starting with ChIPpeakAnno 2.5.11, you can pass a
mart object to addGeneIDs instead of orgAnn. Thanks for your input!
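
A minimal sketch of what such a call might look like, assuming the mart
argument of addGeneIDs and the mirbase_id attribute used further down this
thread. Attribute names differ between Ensembl releases, so it is worth
checking listAttributes(mart) first. The wrapper name add_mirna_ids and its
defaults are hypothetical, not part of ChIPpeakAnno:

```r
# Hypothetical wrapper: add_mirna_ids is NOT a ChIPpeakAnno function.
# It only illustrates passing a mart object instead of an orgAnn package name.
add_mirna_ids <- function(annotatedPeak,
                          dataset = "mmusculus_gene_ensembl",
                          ids = c("mirbase_id")) {
  # Connect to Ensembl (needs network access and the biomaRt package)
  mart <- biomaRt::useMart(biomart = "ensembl", dataset = dataset)
  # Look up the requested attributes by Ensembl gene ID through the mart
  ChIPpeakAnno::addGeneIDs(annotatedPeak,
                           mart = mart,
                           feature_id_type = "ensembl_gene_id",
                           IDs2Add = ids)
}
```

It could then be used as annotatedPeak <- add_mirna_ids(annotatedPeak),
provided both packages are installed.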

Best regards,

Julie 

On 7/30/12 4:51 AM, "Paolo Kunderfranco" <paolo.kunderfranco at gmail.com>
wrote:

> Hello,
> OK, perfect, it is working fine now.
> Thanks again for your valuable help,
> Paolo
> 
> 
> 2012/7/27 Ou, Jianhong <Jianhong.Ou at umassmed.edu>:
>> Hi Paolo,
>> 
>> Because the org database does not contain any information for
>> ENSMUSG00000089245, addGeneIDs raises an error.
>> In this case, it is better to use biomaRt to get the annotation. Please try:
>> 
>> library(biomaRt)
>> library(ChIPpeakAnno)
>>
>> # Collect the unique, non-empty Ensembl gene IDs of the annotated features
>> feature_ids <- unique(annotatedPeak$feature)
>> feature_ids <- feature_ids[!is.na(feature_ids)]
>> feature_ids <- feature_ids[feature_ids != ""]
>>
>> # Query Ensembl for the miRBase annotation of those genes
>> mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
>> IDs2Add <- getBM(attributes = c("ensembl_gene_id", "mirbase_transcript_name",
>>                                 "mirbase_id", "mirbase_accession",
>>                                 "external_gene_id"),
>>                  filters = "ensembl_gene_id",
>>                  values = feature_ids, mart = mart)
>>
>> # Condense genes that returned several rows into one row per gene
>> duplicated_ids <- IDs2Add[duplicated(IDs2Add[, "ensembl_gene_id"]),
>>                           "ensembl_gene_id"]
>> if (length(duplicated_ids) > 0) {
>>     IDs2Add.duplicated <- IDs2Add[IDs2Add[, "ensembl_gene_id"] %in%
>>                                   duplicated_ids, ]
>>     IDs2Add.duplicated <- condenseMatrixByColnames(
>>         as.matrix(IDs2Add.duplicated), "ensembl_gene_id")
>>     IDs2Add <- IDs2Add[!(IDs2Add[, "ensembl_gene_id"] %in% duplicated_ids), ]
>>     IDs2Add <- rbind(IDs2Add, IDs2Add.duplicated)
>> }
>> 
>> Then merge the useful information into annotatedPeak.
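
One way the merge step could be sketched with base R's merge(), assuming the
peaks have first been converted with as.data.frame(). The two data frames
below are toy stand-ins: MACS_peak_110 and the mirbase_id value are made up
for illustration, and only the columns needed for the join are shown:

```r
# Toy stand-in for as.data.frame(annotatedPeak); MACS_peak_110 is hypothetical
peaks <- data.frame(
  peak    = c("MACS_peak_109", "MACS_peak_110"),
  feature = c("ENSMUSG00000089245", NA),
  stringsAsFactors = FALSE
)

# Toy stand-in for the getBM() result; the mirbase_id value is a placeholder
IDs2Add <- data.frame(
  ensembl_gene_id = "ENSMUSG00000089245",
  mirbase_id      = "mmu-mir-example",
  stringsAsFactors = FALSE
)

# Left join on the Ensembl gene ID: all.x = TRUE keeps peaks without a match
annotated <- merge(peaks, IDs2Add,
                   by.x = "feature", by.y = "ensembl_gene_id",
                   all.x = TRUE)
```

Using all.x = TRUE keeps every peak in the result, so peaks whose feature has
no miRBase entry simply get NA in the added columns.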
>> 
>> If you have any questions, please let me know.
>> 
>> Yours sincerely,
>> 
>> Jianhong Ou
>> 
>> jianhong.ou at umassmed.edu
>> 
>> 
>> On Jul 27, 2012, at 9:57 AM, Zhu, Lihua (Julie) wrote:
>> 
>>> Paolo,
>>> 
>>> Could you please send us a few rows of miRNAs in annotatedPeaks? Thanks!
>>> 
>>> Best regards,
>>> 
>>> Julie
>>> ________________________________________
>>> From: bioconductor-bounces at r-project.org
>>> [bioconductor-bounces at r-project.org] on behalf of Paolo Kunderfranco
>>> [paolo.kunderfranco at gmail.com]
>>> Sent: Friday, July 27, 2012 5:50 AM
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] ChIPpeakAnno to find peaks nearest to miRNA
>>> 
>>> Dear All,
>>> I would like to use ChIPpeakAnno to find peaks nearest to miRNA.
>>> 
>>> I loaded my BED file, converted it to a RangedData object, loaded the
>>> mmusculus_gene_ensembl dataset through biomaRt, and annotated my peaks.
>>> That seems to work:
>>> 
>>> test.rangedData = BED2RangedData(test.bed)
>>> mart<-useMart(biomart="ensembl",dataset="mmusculus_gene_ensembl")
>>> Annotation = getAnnotation(mart, featureType="miRNA")
>>> annotatedPeak = annotatePeakInBatch(test.rangedData,
>>> AnnotationData=Annotation)
>>> as.data.frame(annotatedPeak)
>>> 
>>> <factor>            <IRanges> |   <character> <character>
>>> <character>      <numeric>    <numeric>   <character>
>>> MACS_peak_109 ENSMUSG00000089245        1 [54494876, 54496209] |
>>> MACS_peak_109           + ENSMUSG00000089245       54826062
>>> 54826166      upstream
>>> <numeric>        <numeric>              <character>
>>> -331186           329853             NearestStart
>>> 
>>> 
>>> Now I would like to add the miRNA IDs, as I did when I annotated for
>>> TSS, but something goes wrong. Any ideas how to solve it?
>>> 
>>> library("org.Mm.eg.db")
>>> b<- addGeneIDs(annotatedPeak,"org.Mm.eg.db",c("symbol"))
>>> Error: No entrez identifier can be mapped by input data based on the
>>> feature_id_type. Please consider to use correct feature_id_type,
>>> orgAnn or annotatedPeak
>>> 
>>> 
>>> Thanks,
>>> 
>>> Paolo
>>> 
>>> 
>>>> traceback()
>>> 2: stop("No entrez identifier can be mapped by input data based on the
>>> feature_id_type.\nPlease consider to use correct feature_id_type,
>>> orgAnn or annotatedPeak\n",
>>>       call. = FALSE)
>>> 1: addGeneIDs(annotatedPeak, "org.Mm.eg.db", c("symbol"))
>>>> sessionInfo()
>>> R version 2.15.0 (2012-03-30)
>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>> 
>>> locale:
>>> [1] LC_COLLATE=Italian_Italy.1252  LC_CTYPE=Italian_Italy.1252
>>> LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
>>> [5] LC_TIME=Italian_Italy.1252
>>> 
>>> attached base packages:
>>> [1] grid      stats     graphics  grDevices utils     datasets
>>> methods   base
>>> 
>>> other attached packages:
>>> [1] targetscan.Mm.eg.db_0.5.0           BiocInstaller_1.4.7
>>>      org.Mm.eg.db_2.7.1                  ChIPpeakAnno_2.4.0
>>> [5] limma_3.12.1                        org.Hs.eg.db_2.7.1
>>>      GO.db_2.7.1                         RSQLite_0.11.1
>>> [9] DBI_0.2-5                           AnnotationDbi_1.18.1
>>>      BSgenome.Ecoli.NCBI.20080805_1.3.17 BSgenome_1.24.0
>>> [13] GenomicRanges_1.8.7                 Biostrings_2.24.1
>>>      IRanges_1.14.4                      multtest_2.12.0
>>> [17] Biobase_2.16.0                      biomaRt_2.12.0
>>>      BiocGenerics_0.2.0                  gplots_2.11.0
>>> [21] MASS_7.3-19                         KernSmooth_2.23-8
>>>      caTools_1.13                        bitops_1.0-4.1
>>> [25] gdata_2.11.0                        gtools_2.7.0
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] RCurl_1.91-1.1   splines_2.15.0   stats4_2.15.0
>>> survival_2.36-14 tools_2.15.0     XML_3.9-4.1
>>> 