[BioC] Question RE: getEnrichedGo in ChIPpeakAnno package
Marc Carlson
mcarlson at fhcrc.org
Thu Jun 10 23:29:15 CEST 2010
Hi Noah,
Yeast is difficult because that community has a strong preference for
their classic IDs. The same is true in Arabidopsis. This is why those
two organism packages have "sgd" and "tair" in their respective package
names. I have managed to keep the rest of the org packages Entrez Gene
centric, however. Another thing you can do if you find yourself with a
similar problem is to use the org.Sc.sgdENTREZID mapping provided by the
org.Sc.sgd.db package. The package may be ORF-centric, but you can
still map to an Entrez Gene ID through that mapping.
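
A minimal sketch of that lookup, in case it helps. The helper name
orfs_to_entrez is made up for illustration and is not part of any package;
it assumes the keys of the mapping are ORF names such as YAL069W, and it
simply keeps the first Entrez Gene ID when an ORF has more than one.

```r
# Sketch only: map yeast ORF names to Entrez Gene IDs.
# `orfs_to_entrez` is a hypothetical helper, not a function from any package.
orfs_to_entrez <- function(orfs, map) {
  # mget() works on AnnotationDbi Bimap objects and on plain environments;
  # ORFs with no entry in the map come back as NA.
  hits <- mget(unique(orfs), map, ifnotfound = NA)
  # Keep the first Entrez Gene ID when an ORF maps to several.
  ids <- vapply(hits, function(x) as.character(x[1]), character(1))
  # Drop ORFs that could not be mapped; names(ids) are the ORF names.
  ids[!is.na(ids)]
}

# With the annotation package installed, usage would look like:
# library(org.Sc.sgd.db)
# eg <- orfs_to_entrez(c("YAL069W", "YAL062W"), org.Sc.sgdENTREZID)
```

The result is a named character vector (names are ORFs, values are Entrez
Gene IDs), so it is easy to see which ORFs were dropped for lack of a mapping.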
Marc
On 06/10/2010 01:53 PM, Zhu, Julie wrote:
> Hi Noah,
>
> Yes, you are right that this is due to the differences between org.Hs.eg.db and org.Sc.sgd.db. org.Hs.eg.db is Entrez ID centric while org.Sc.sgd.db is ORF-centric. It would be nice if all the org.*.*.db packages had a similar data structure and mapping. For now, I would suggest calling the getEnrichedGO function with a list of ORFs using the following syntax. You need to convert the list of Ensembl IDs to ORFs first.
>
> enrichedGO.Cse4 <- getEnrichedGO (orfs, feature_id_type="entrez_id", orgAnn="org.Sc.sgd.db", maxP=0.05, multiAdj =TRUE, minGOterm=5, multiAdjMethod="BH")
>
> Best regards,
>
> Julie
>
>
>
> On 6/10/10 4:15 PM, "Noah Dowell" <noahd at ucla.edu> wrote:
>
> Hello All,
>
> I couldn't find a solution to my question in the archives and my attempts have been unsuccessful so hopefully someone has some advice.
>
> I have analyzed my yeast ChIP-chip tiling array data using Starr and converted my list of chip-enriched regions to RangedData to make use of the peakOverlap and GOenrichment functions in ChIPpeakAnno. The annotatePeakInBatch function has worked nicely but I am stuck with the getEnrichedGO function. I think the problem may be due to differences between the org.Hs.eg.db and org.Sc.sgd.db. The org.Hs.eg.db has a mapping of ENSEMBL gene accession numbers to Entrez Gene identifiers, but the org.Sc.sgd.db completely lacks this and uses a mapping to SGD Gene Identifiers. As far as I can tell the getEnrichedGO function calls for a mapping to Entrez Gene ids thus the error I am showing below.
>
> Does anyone know of a workaround for this?
>
> Thank you for your help.
>
> Noah
>
>
>
>> library(org.Sc.sgd.db)
>>
>
>> goTest <- getEnrichedGO(annoPeakChr1data, orgAnn = "org.Sc.sgd.db", maxP = 0.01, multiAdj =TRUE, minGOterm = 10, multiAdjMethod = "BH")
>>
> Error in get(paste(GOgenome, "ENSEMBL2EG", sep = "")) :
> object 'org.Sc.sgdENSEMBL2EG' not found
>
> ##also tried:
>
>
>> goTest <- getEnrichedGO(annoPeakChr1data, orgAnn = "org.Sc.sgd.db", feature_id_type= "ensembl_gene_id", maxP = 0.01, multiAdj =TRUE, minGOterm = 10, multiAdjMethod = "BH")
>>
> Error in get(paste(GOgenome, "ENSEMBL2EG", sep = "")) :
> object 'org.Sc.sgdENSEMBL2EG' not found
>
> #### here is what my annotatePeak Object looks like:
>
>
>> head(annoPeakChr1data)
>>
> RangedData with 6 rows and 9 value columns across 1 space
>                    space         ranges |        peak      strand     feature start_position end_position
>              <character>      <IRanges> | <character> <character> <character>      <numeric>    <numeric>
> 01 YAL069W             I [   16,   254] |          01           1     YAL069W            335          649
> 02 YAL067W-A           I [ 2731,  2924] |          02           1   YAL067W-A           2480         2707
> 06 YAL062W             I [29935, 29959] |          06           1     YAL062W          31568        32941
> 07 YAL062W             I [30011, 30039] |          07           1     YAL062W          31568        32941
> 08 YAL062W             I [31661, 31678] |          08           1     YAL062W          31568        32941
> 09 YAL062W             I [31702, 31710] |          09           1     YAL062W          31568        32941
>
>> sessionInfo()
>>
> R version 2.11.0 (2010-04-22)
> i386-apple-darwin9.8.0
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] grid stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] org.Sc.sgd.db_2.4.1 rtracklayer_1.8.1 RCurl_1.3-1 bitops_1.0-4.1
> [5] Starr_1.4.0 affxparser_1.20.0 affy_1.26.0 Ringo_1.12.0
> [9] Matrix_0.999375-38 lattice_0.18-5 RColorBrewer_1.0-2 ChIPpeakAnno_1.4.0
> [13] limma_3.4.0 org.Hs.eg.db_2.4.1 GO.db_2.4.1 RSQLite_0.8-4
> [17] DBI_0.2-5 AnnotationDbi_1.10.0 BSgenome.Ecoli.NCBI.20080805_1.3.16 BSgenome_1.16.0
> [21] GenomicRanges_1.0.1 Biostrings_2.16.0 IRanges_1.6.0 multtest_2.4.0
> [25] Biobase_2.8.0 biomaRt_2.4.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.16.0 annotate_1.26.0 genefilter_1.30.0 MASS_7.3-5 preprocessCore_1.10.0 pspline_1.0-14
> [7] splines_2.11.0 survival_2.35-8 tools_2.11.0 XML_2.8-1 xtable_1.5-6
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>