[BioC] ChIPpeakAnno::getEnrichedGo crashes but I don't know why
Zhu, Lihua (Julie)
Julie.Zhu at umassmed.edu
Wed Jan 5 22:15:13 CET 2011
Eric,
You could convert the exon IDs associated with the peaks to Ensembl gene
IDs and pass these Ensembl gene IDs to the getEnrichedGO function.
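
A minimal base-R sketch of that conversion, assuming you already have a two-column exon-to-gene lookup table (for instance, one retrieved from the same Ensembl mart). The function name exon_to_gene, the column names, the gene IDs, and the last exon ID below are placeholders for illustration, not real annotations.

```r
# Map exon IDs to gene IDs using a lookup table.
# `mapping` is assumed to have character columns `exon_id` and `gene_id`.
exon_to_gene <- function(exon_ids, mapping) {
  idx <- match(exon_ids, mapping$exon_id)   # NA where an exon has no entry
  gene_ids <- mapping$gene_id[idx]
  unique(gene_ids[!is.na(gene_ids)])        # drop unmapped IDs, de-duplicate
}

# Toy lookup table: exon IDs taken from the thread, gene IDs are PLACEHOLDERS.
mapping <- data.frame(
  exon_id = c("ENSE00001749374", "ENSE00001643382", "ENSE00001807176"),
  gene_id = c("ENSG_PLACEHOLDER_A", "ENSG_PLACEHOLDER_B", "ENSG_PLACEHOLDER_A"),
  stringsAsFactors = FALSE
)

# The third ID is a made-up exon with no entry in the table.
peak_exons <- c("ENSE00001749374", "ENSE00001807176", "ENSE00009999999")
gene_ids <- exon_to_gene(peak_exons, mapping)
print(gene_ids)
```

The resulting character vector of gene IDs is what would then go to getEnrichedGO with feature_id_type="ensembl_gene_id".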
Best regards,
Julie
On 1/5/11 4:09 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
> Hi Julie,
>
> It may be a while before I get back to you on this, because I did my
> mapping and ChIP-Seq analysis with Hg19 (NCBI 37), not Hg18 (NCBI 36).
> I'm also a little concerned about using transcription start site
> annotations rather than exons, because the binding domains are not
> thought to be restricted to only promoters. Any suggestions?
>
> Eric
>
>
>
> Zhu, Lihua (Julie) wrote:
>> Eric,
>>
>> The annotated dataset has exon ID instead of gene ID while the getEnrichedGO
>> is expecting feature_id_type="ensembl_gene_id". For a list of supported
>> feature_id_type, please type ?getEnrichedGO.
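
A quick way to check which kind of ID an annotated dataset carries is to look at the prefix: human Ensembl stable IDs start with ENSG for genes, ENST for transcripts, and ENSE for exons. The helper name id_type is made up for this sketch.

```r
# Classify human Ensembl stable IDs by their prefix.
id_type <- function(ids) {
  ifelse(grepl("^ENSG", ids), "gene",
  ifelse(grepl("^ENST", ids), "transcript",
  ifelse(grepl("^ENSE", ids), "exon", "other")))
}

# Feature IDs copied from the annotated regions posted in this thread.
features <- c("ENSE00001749374", "ENSE00001643382", "ENSE00001807176")
print(table(id_type(features)))

# TRUE only when every feature is a gene ID, which is what
# feature_id_type = "ensembl_gene_id" expects.
all_genes <- all(id_type(features) == "gene")
print(all_genes)
```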
>>
>> To use getEnrichedGO function, first get the TSS annotation.
>>
>> TSS.human.NCBI36 = getAnnotation(ENSEMBLE_GENES_MART, featureType="TSS")
>>
>> or use the built-in TSS dataset:
>>
>> data(TSS.human.NCBI36)
>>
>> Then annotate your peaks with TSS.human.NCBI36 followed by getEnrichedGO
>> call.
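
For intuition only, here is a base-R sketch of what annotating peaks against TSS positions amounts to: each peak is assigned the gene with the closest TSS on the same chromosome. This is not ChIPpeakAnno's implementation; it ignores strand, and the function name nearest_tss, the coordinates, and the gene names are all made up.

```r
# For each peak, find the nearest TSS on the same chromosome.
nearest_tss <- function(peaks, tss) {
  out <- lapply(seq_len(nrow(peaks)), function(i) {
    cand <- tss[tss$chr == peaks$chr[i], ]
    if (nrow(cand) == 0) {
      # No TSS on this chromosome: report the peak with missing values.
      return(data.frame(peak = peaks$name[i], gene = NA_character_,
                        distance = NA_integer_, stringsAsFactors = FALSE))
    }
    d <- peaks$start[i] - cand$tss   # signed distance: peak start minus TSS
    j <- which.min(abs(d))           # index of the closest TSS
    data.frame(peak = peaks$name[i], gene = cand$gene[j],
               distance = d[j], stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

# Hypothetical peaks and TSS positions.
peaks <- data.frame(name = c("p1", "p2"), chr = c("1", "1"),
                    start = c(1500L, 9000L), stringsAsFactors = FALSE)
tss <- data.frame(gene = c("geneA", "geneB"), chr = c("1", "1"),
                  tss = c(1000L, 10000L), stringsAsFactors = FALSE)

ann <- nearest_tss(peaks, tss)
print(ann)
```

Because every peak ends up with a gene-level feature, the result can be used for a GO enrichment test directly, with no exon-to-gene conversion.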
>>
>> Please let me know if this works for you.
>>
>> Best regards,
>>
>> Julie
>>
>>
>>
>>
>> On 1/5/11 12:29 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
>>
>>> Hi Julie,
>>>
>>> Thank you for your response.
>>>
>>> Here is the sessionInfo and traceback output and also a few lines of
>>> "my_annotated_regions".
>>>
>>> Regards,
>>>
>>> Eric Cabot
>>>
>>>> sessionInfo()
>>> R version 2.12.1 (2010-12-16)
>>> Platform: x86_64-unknown-linux-gnu (64-bit)
>>>
>>> locale:
>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
>>> [9] LC_ADDRESS=C LC_TELEPHONE=C
>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>>
>>> other attached packages:
>>> [1] ChIPpeakAnno_1.6.0 limma_3.6.9
>>> [3] org.Hs.eg.db_2.4.6 GO.db_2.4.5
>>> [5] RSQLite_0.9-4 DBI_0.2-5
>>> [7] AnnotationDbi_1.12.0 BSgenome.Ecoli.NCBI.20080805_1.3.16
>>> [9] BSgenome_1.18.2 GenomicRanges_1.2.2
>>> [11] Biostrings_2.18.2 IRanges_1.8.8
>>> [13] multtest_2.6.0 Biobase_2.10.0
>>> [15] biomaRt_2.6.0
>>>
>>> loaded via a namespace (and not attached):
>>> [1] MASS_7.3-9 RCurl_1.5-0 splines_2.12.1 survival_2.36-2
>>> [5] tools_2.12.1 XML_3.2-0
>>>> my_enrichedGO<-getEnrichedGO(my_annotated_regions,orgAnn="org.Hs.eg.db",maxP=0.01,multiAdj=FALSE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>> argument is of length zero
>>>> traceback()
>>> 2: addAncestors(this.GO[this.GO[, 3] == "BP", ], "bp")
>>> 1: getEnrichedGO(FC2_annotated_regions, orgAnn = "org.Hs.eg.db",
>>> maxP = 0.01, multiAdj = FALSE, minGOterm = 1, feature_id_type =
>>> "ensembl_gene_id")
>>>
>>>
>>>
>>>> as.data.frame(my_annotated_regions[1:15,])
>>> space start end width names peak strand
>>> 1 1 241997936 241998205 270 R-10060 ENSE00001749374 R-10060 +
>>> 2 1 237109743 237110002 260 R-10082 ENSE00001643382 R-10082 +
>>> 3 1 236080267 236080415 149 R-10086 ENSE00001807176 R-10086 +
>>> 4 1 233853245 233853514 270 R-10096 ENSE00001776382 R-10096 +
>>> 5 1 233727956 233728104 149 R-10097 ENSE00001442190 R-10097 +
>>> 6 1 230728554 230728823 270 R-10108 ENSE00001731401 R-10108 +
>>> 7 1 229687129 229687277 149 R-10113 ENSE00001439385 R-10113 +
>>> 8 1 228943263 228943412 150 R-10121 ENSE00001903546 R-10121 +
>>> 9 1 218358885 218359176 292 R-10157 ENSE00001439386 R-10157 +
>>> 10 1 212254259 212254408 150 R-10179 ENSE00001624346 R-10179 +
>>> 11 1 210086264 210086513 250 R-10184 ENSE00001903225 R-10184 +
>>> 12 1 209863549 209863698 150 R-10185 ENSE00001336255 R-10185 +
>>> 13 1 207437117 207437264 148 R-10190 ENSE00001742112 R-10190 +
>>> 14 1 190352400 190352548 149 R-10246 ENSE00001782518 R-10246 +
>>> 15 1 184432607 184432755 149 R-10260 ENSE00001283926 R-10260 +
>>> feature start_position end_position insideFeature distancetoFeature
>>> 1 ENSE00001749374 241995237 241996089 downstream 2699
>>> 2 ENSE00001643382 237144639 237145008 upstream -34896
>>> 3 ENSE00001807176 236078715 236078821 downstream 1552
>>> 4 ENSE00001776382 233807017 233807237 downstream 46228
>>> 5 ENSE00001442190 233749750 233750272 upstream -21794
>>> 6 ENSE00001731401 230728406 230728586 overlapEnd 148
>>> 7 ENSE00001439385 229685652 229685769 downstream 1477
>>> 8 ENSE00001903546 228882063 228882416 downstream 61200
>>> 9 ENSE00001439386 218303137 218303294 downstream 55748
>>> 10 ENSE00001624346 212253973 212254092 downstream 286
>>> 11 ENSE00001903225 210111538 210111622 upstream -25274
>>> 12 ENSE00001336255 209859550 209859630 downstream 3999
>>> 13 ENSE00001742112 207438342 207438381 upstream -1225
>>> 14 ENSE00001782518 190331193 190331400 downstream 21207
>>> 15 ENSE00001283926 184446520 184446737 upstream -13913
>>> shortestDistance fromOverlappingOrNearest
>>> 1 1847 NearestStart
>>> 2 34637 NearestStart
>>> 3 1446 NearestStart
>>> 4 46008 NearestStart
>>> 5 21646 NearestStart
>>> 6 32 NearestStart
>>> 7 1360 NearestStart
>>> 8 60847 NearestStart
>>> 9 55591 NearestStart
>>> 10 167 NearestStart
>>> 11 25025 NearestStart
>>> 12 3919 NearestStart
>>> 13 1078 NearestStart
>>> 14 21000 NearestStart
>>> 15 13765 NearestStart
>>>
>>>
>>> Zhu, Lihua (Julie) wrote:
>>>> Hi Eric,
>>>>
>>>> Could you please post the session information with sessionInfo() command?
>>>> Could you please also send a few ensembl IDs in your annotated dataset?
>>>> Thanks!
>>>>
>>>> Best regards,
>>>>
>>>> Julie
>>>>
>>>>
>>>> On 1/4/11 6:51 PM, "Eric Cabot" <elcabot at gmail.com> wrote:
>>>>
>>>>> I am a relatively new Bioconductor user and I am trying to analyze some
>>>>> ChIP-seq results that came from QuEST using the ChIPpeakAnno package.
>>>>>
>>>>> After importing the regions of interest into RangedData objects and doing
>>>>> the following:
>>>>>
>>>>>
>>>>> ENSEMBLE_GENES_MART<-useMart(biomart="ensembl",dataset="hsapiens_gene_ensembl")
>>>>> ENSEMBL_ExonPlus_Annotation<-getAnnotation(ENSEMBLE_GENES_MART,
>>>>> featureType="ExonPlusUtr")
>>>>>
>>>>>
>>>>> I had no problem annotating and generating a Venn diagram to show the
>>>>> overlaps between my three sets of peaks. To annotate, I used:
>>>>>
>>>>> annotated_regions=annotatePeakInBatch(myranged,
>>>>> AnnotationData=ENSEMBL_ExonPlus_Annotation)
>>>>>
>>>>>
>>>>> But I cannot seem to get the getEnrichedGO function to work on this (or my
>>>>> other two annotated regions). Here is a typical command line:
>>>>>
>>>>>
>>>>> my_enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",maxP=0.01,multiAdj=TRUE,minGOterm=1,multiAdjMethod="BH",feature_id_type="ensembl_gene_id")
>>>>>
>>>>> and here is a typical error message:
>>>>>
>>>>> enrichedGO<-getEnrichedGO(annotated_regions,orgAnn="org.Hs.eg.db",maxP=0.01,multiAdj=TRUE,minGOterm=1,feature_id_type="ensembl_gene_id")
>>>>> Error in if (class(go.ids) != "matrix" | dim(go.ids)[2] < 4) { :
>>>>> argument is of length zero
>>>>>
>>>>>
>>>>> Which leads me to ask:
>>>>>
>>>>> 1) Is this error message supposed to be meaningful to me (i.e., a user), or is
>>>>> it something that I should be sending to the developer of the package?
>>>>>
>>>>> 2) Is there anything obvious from this that suggests what corrective
>>>>> action I should be taking?
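
On the first question, the message itself can be reproduced in plain R, which may help in reading it. The sketch below assumes, without having checked the package source, that go.ids reached the quoted if() test without dimensions (for example as an empty vector because no feature ID matched a gene). The variable is_bad is made up for illustration.

```r
# Assumption: go.ids came back without dimensions (here, an empty vector).
go.ids <- character(0)

# dim() of a plain vector is NULL, so `dim(go.ids)[2] < 4` has length zero,
# and `TRUE | logical(0)` has length zero as well.
cond <- class(go.ids) != "matrix" | dim(go.ids)[2] < 4
print(length(cond))

# if() needs a length-one condition, hence the error seen in the thread.
msg <- tryCatch(if (cond) "bad input", error = function(e) conditionMessage(e))
print(msg)

# A more defensive test that always yields a single TRUE or FALSE:
# `||` short-circuits, so ncol() is only evaluated for real matrices.
is_bad <- !is.matrix(go.ids) || ncol(go.ids) < 4
print(is_bad)
```

Read this way, "argument is of length zero" says that an intermediate result was empty, which fits feature IDs of the wrong type being looked up.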
>>>>>
>>>>>
>>>>> Eric Cabot
>>>>> University of Wisconsin
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>
>>>>
>>
>>
>