[BioC] ChIPpeakAnno to find peaks nearest to miRNA
Zhu, Lihua (Julie)
Julie.Zhu at umassmed.edu
Wed Aug 8 04:46:43 CEST 2012
Paolo,
Just to let you know that starting with ChIPpeakAnno 2.5.11, you can pass a
mart object to addGeneIDs instead of orgAnn. Thanks for your input!
Best regards,
Julie
On 7/30/12 4:51 AM, "Paolo Kunderfranco" <paolo.kunderfranco at gmail.com>
wrote:
> Hello,
> OK, perfect, it is working fine now.
> Thanks again for your precious help,
> Paolo
>
>
> 2012/7/27 Ou, Jianhong <Jianhong.Ou at umassmed.edu>:
>> Hi Paolo,
>>
>> Because the org database does not contain any information for
>> ENSMUSG00000089245, addGeneIDs throws an error.
>> In this case it is better to use biomaRt to get the annotation. Please try:
>>
>> # Collect the unique, non-empty Ensembl gene IDs of the annotated features
>> feature_ids <- unique(annotatedPeak$feature)
>> feature_ids <- feature_ids[!is.na(feature_ids)]
>> feature_ids <- feature_ids[feature_ids != ""]
>> mart <- useMart(biomart = "ensembl", dataset = "mmusculus_gene_ensembl")
>> # Query biomaRt for the miRNA annotation of those IDs
>> IDs2Add <- getBM(attributes = c("ensembl_gene_id", "mirbase_transcript_name",
>>                                 "mirbase_id", "mirbase_accession", "external_gene_id"),
>>                  filters = "ensembl_gene_id", values = feature_ids, mart = mart)
>> # Condense IDs returned on several rows into one row per ensembl_gene_id
>> duplicated_ids <- IDs2Add[duplicated(IDs2Add[, "ensembl_gene_id"]), "ensembl_gene_id"]
>> if (length(duplicated_ids) > 0) {
>>   IDs2Add.duplicated <- IDs2Add[IDs2Add[, "ensembl_gene_id"] %in% duplicated_ids, ]
>>   IDs2Add.duplicated <- condenseMatrixByColnames(as.matrix(IDs2Add.duplicated),
>>                                                  "ensembl_gene_id")
>>   IDs2Add <- IDs2Add[!(IDs2Add[, "ensembl_gene_id"] %in% duplicated_ids), ]
>>   IDs2Add <- rbind(IDs2Add, IDs2Add.duplicated)
>> }
>>
>> Then merge the useful information into annotatedPeak.
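
The merge step is not spelled out in the thread, so here is one way it might look, as a minimal base-R sketch. The two data frames are small made-up stand-ins for as.data.frame(annotatedPeak) and for the getBM() result (the second gene ID and the miRNA name are placeholders, not real annotation). The sketch assumes the peaks table carries the Ensembl ID in a column named "feature", as in the output shown in the original post.

```r
# Hypothetical stand-in for as.data.frame(annotatedPeak); only two columns kept
peaks <- data.frame(
  peak    = c("MACS_peak_109", "MACS_peak_110", "MACS_peak_111"),
  feature = c("ENSMUSG00000089245", "ENSMUSG_PLACEHOLDER", NA),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the IDs2Add table returned by getBM();
# the mirbase_id value is a placeholder, not a real annotation
IDs2Add <- data.frame(
  ensembl_gene_id = "ENSMUSG00000089245",
  mirbase_id      = "placeholder-mir",
  stringsAsFactors = FALSE
)

# Left join: keep every peak, add annotation columns where the IDs match
merged <- merge(peaks, IDs2Add,
                by.x = "feature", by.y = "ensembl_gene_id",
                all.x = TRUE, sort = FALSE)

print(merged)
```

With all.x = TRUE, peaks whose feature has no entry in IDs2Add are kept and get NA in the added columns instead of being dropped.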
>>
>> If you have any questions, please let me know.
>>
>> Yours sincerely,
>>
>> Jianhong Ou
>>
>> jianhong.ou at umassmed.edu
>>
>>
>> On Jul 27, 2012, at 9:57 AM, Zhu, Lihua (Julie) wrote:
>>
>>> Paolo,
>>>
>>> Could you please send us a few rows of miRNAs in annotatedPeaks? Thanks!
>>>
>>> Best regards,
>>>
>>> Julie
>>> ________________________________________
>>> From: bioconductor-bounces at r-project.org
>>> [bioconductor-bounces at r-project.org] on behalf of Paolo Kunderfranco
>>> [paolo.kunderfranco at gmail.com]
>>> Sent: Friday, July 27, 2012 5:50 AM
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] ChIPpeakAnno to find peaks nearest to miRNA
>>>
>>> Dear All,
>>> I would like to use ChIPpeakAnno to find peaks nearest to miRNA.
>>>
>>> I loaded my BED file, converted it to a RangedData object, loaded the
>>> mmusculus_gene_ensembl dataset through mart and annotated my peaks,
>>> and it seems OK:
>>>
>>> test.rangedData = BED2RangedData(test.bed)
>>> mart<-useMart(biomart="ensembl",dataset="mmusculus_gene_ensembl")
>>> Annotation = getAnnotation(mart, featureType="miRNA")
>>> annotatedPeak = annotatePeakInBatch(test.rangedData,
>>> AnnotationData=Annotation)
>>> as.data.frame(annotatedPeak)
>>>
>>>                                  <factor> <IRanges>            | <character>   <character>
>>> MACS_peak_109 ENSMUSG00000089245 1        [54494876, 54496209] | MACS_peak_109 +
>>>   <character>        <numeric> <numeric> <character>
>>>   ENSMUSG00000089245 54826062  54826166  upstream
>>>   <numeric> <numeric> <character>
>>>   -331186   329853    NearestStart
>>>
>>>
>>> Now I would like to add miRNA Id as I already did when I annotated for
>>> TSS, but something goes wrong, any ideas how to solve it?
>>>
>>> library("org.Mm.eg.db")
>>> b<- addGeneIDs(annotatedPeak,"org.Mm.eg.db",c("symbol"))
>>> Error: No entrez identifier can be mapped by input data based on the
>>> feature_id_type. Please consider to use correct feature_id_type,
>>> orgAnn or annotatedPeak
>>>
>>>
>>> Thanks,
>>>
>>> Paolo
>>>
>>>
>>> > traceback()
>>> 2: stop("No entrez identifier can be mapped by input data based on the
>>> feature_id_type.\nPlease consider to use correct feature_id_type,
>>> orgAnn or annotatedPeak\n",
>>> call. = FALSE)
>>> 1: addGeneIDs(annotatedPeak, "org.Mm.eg.db", c("symbol"))
>>> > sessionInfo()
>>> R version 2.15.0 (2012-03-30)
>>> Platform: i386-pc-mingw32/i386 (32-bit)
>>>
>>> locale:
>>> [1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252
>>> LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
>>> [5] LC_TIME=Italian_Italy.1252
>>>
>>> attached base packages:
>>> [1] grid stats graphics grDevices utils datasets
>>> methods base
>>>
>>> other attached packages:
>>> [1] targetscan.Mm.eg.db_0.5.0 BiocInstaller_1.4.7
>>> org.Mm.eg.db_2.7.1 ChIPpeakAnno_2.4.0
>>> [5] limma_3.12.1 org.Hs.eg.db_2.7.1
>>> GO.db_2.7.1 RSQLite_0.11.1
>>> [9] DBI_0.2-5 AnnotationDbi_1.18.1
>>> BSgenome.Ecoli.NCBI.20080805_1.3.17 BSgenome_1.24.0
>>> [13] GenomicRanges_1.8.7 Biostrings_2.24.1
>>> IRanges_1.14.4 multtest_2.12.0
>>> [17] Biobase_2.16.0 biomaRt_2.12.0
>>> BiocGenerics_0.2.0 gplots_2.11.0
>>> [21] MASS_7.3-19 KernSmooth_2.23-8
>>> caTools_1.13 bitops_1.0-4.1
>>> [25] gdata_2.11.0 gtools_2.7.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] RCurl_1.91-1.1 splines_2.15.0 stats4_2.15.0
>>> survival_2.36-14 tools_2.15.0 XML_3.9-4.1
>>>