[BioC] Genes annotated to GO:0031281 using org.Hs.eg.db

James W. MacDonald jmacdon at uw.edu
Thu Mar 20 15:38:00 CET 2014


Hi Tim,

Note that you _always_ use a set of genes for hyperGTest. Even if you 
started with an array, you can only count a significantly differentially 
expressed gene once.

The difference here is that you used the org.Hs.eg.db package as the 
annotation rather than some array annotation package, so 
probeSetSummary() won't work, as it is trying to match things back to 
probe IDs.

So you can do something like this:

gn2go <- select(org.Hs.eg.db, geneIds(hgOverGO), "GOALL")
gos <- summary(hgOverGO)[,1]
golst <- lapply(gos, function(x) gn2go[gn2go[,2] %in% x,])

Best,

Jim



On 3/19/2014 7:03 PM, Tim Smith wrote:
> Hi James,
>
> Hmmm....I seem to get an error if I use probeSetSummary. As I 
> mentioned before I run the hyperGTest using a set of genes (and not a 
> set of probes).
>
> Here is what I get:
>
> hgOverGO <- hyperGTest(paramsGO)
>
> > hgOverGO
> Gene to GO BP Conditional test for over-representation
> 3843 GO BP ids tested (220 have p < 0.05)
> Selected gene set size: 318
>     Gene universe size: 6088
>     Annotation package: org.Hs.eg
>
> > class(hgOverGO)
> [1] "GOHyperGResult"
> attr(,"package")
> [1] "GOstats"
>
> > head(summary(hgOverGO))
>       GOBPID Pvalue OddsRatio   ExpCount Count Size                 Term
> 1 GO:0007268 2.658284e-07  2.645556 19.0653745    43  365     
>  synaptic transmission
> 2 GO:0001975 1.299252e-06 21.622722  0.6790407     7   13    response 
> to amphetamine
> 3 GO:0035637 2.726576e-06  2.320191 22.8784494    46  438 
> multicellular organismal signaling
> 4 GO:0072511 1.195032e-05  3.566708  6.2680683    19  120         
>  divalent inorganic cation transport
> 5 GO:0018958 1.788607e-05  9.280645  1.2536137     8   24 
> phenol-containing compound metabolic process
> 6 GO:0042886 2.537415e-04  3.066360  5.9546649    16  114           
>  amide transport
>
>
> > ps <- probeSetSummary(hgOverGO,pvalue=0.05)
> Error in (function (classes, fdef, mtable)  :
>   unable to find an inherited method for function ‘columns’ for 
> signature ‘"function"’
>
> > traceback()
> 7: stop(gettextf("unable to find an inherited method for function %s 
> for signature %s",
>  sQuote(fdef at generic), sQuote(cnames)), domain = NA)
> 6: (function (classes, fdef, mtable)
>    {
>        methods <- .findInheritedMethods(classes, fdef, mtable)
>        if (length(methods) == 1L)
>  return(methods[[1L]])
>        else if (length(methods) == 0L) {
>            cnames <- paste0("\"", sapply(classes, as.character),
>                "\"", collapse = ", ")
>  stop(gettextf("unable to find an inherited method for function %s for 
> signature %s",
>  sQuote(fdef at generic), sQuote(cnames)), domain = NA)
>        }
>        else stop("Internal error in finding inherited methods; didn't 
> return a unique method",
>            domain = NA)
>  })(list("function"), function (x)
>  standardGeneric("columns"), <environment>)
> 5: columns(db)
> 4: match(x, table, nomatch = 0L)
> 3: map %in% columns(db)
> 2: getAnnMap(ids, annotation(result))
> 1: probeSetSummary(hgOverGO, pvalue = 0.05)
>
>
> Am I calling the probeSetSummary function incorrectly?
>
>
> thanks!!
>
>
> On Wednesday, March 19, 2014 5:54 PM, Tim Smith 
> <tim_smith_666 at yahoo.com> wrote:
> Hi James,
>
> Thanks for the reply! I had 371 genes ( gene universe ~ 7k genes) for 
> which I was checking enrichment and I got this term as one of the 
> significant terms. The details are:
>
>                GOBPID      Pvalue OddsRatio    ExpCount Count Size     
>             Term
> GO:0031281 0.021301601 10.113271  0.22913929     2   15               
>             positive regulation of cyclase activity
>
> I used GOstats for this analysis. If Count = 2, then shouldn't there 
> be two genes that are directly annotated to this term?
>
>
>
>
>
> On Wednesday, March 19, 2014 10:17 AM, James W. MacDonald 
> <jmacdon at uw.edu <mailto:jmacdon at uw.edu>> wrote:
>
> Hi Tim,
>
> There may not be any IDs mapped to that term directly. You can use
> GO2ALLEGS, which maps all direct and child terms to Entrez Gene IDs.
>
> > get("GO:0031281", org.Hs.egGO2ALLEGS)
>      IDA     IDA     TAS     TAS     IDA     TAS IEA     TAS TAS
> TAS
>     "49"   "116"   "116"   "117"   "135"   "135" "136"   "136" "140"
> "153"
>      IDA     IDA     TAS     IDA     IEA     IDA IDA     TAS IDA
> IEA
>    "154"   "155"   "554"   "796"   "796"   "799" "1394"  "1394" "1812"
> "1812"
>      IDA     IEA     NAS     TAS     TAS     IEA TAS     TAS IEA
> TAS
>   "1816"  "1816"  "1816"  "1816"  "1909"  "2692" "2696"  "2740" "2774"
> "2778"
>      IDA     ISS     ISS     IBA     IBA     IMP ISS     TAS TAS
> TAS
>   "2852"  "3973"  "4763"  "4842"  "4843"  "4846" "4846"  "4914" "4915"
> "5032"
>      ISS     NAS     IEA     IDA     IEA     TAS     TAS
>   "5578"  "5894"  "7077"  "7432"  "7434" "10486" "10487"
>
> Or the more powerful select() method:
>
> > select(org.Hs.eg.db, "GO:0031281", c("ENTREZID", "SYMBOL"), "GOALL")
>          GOALL EVIDENCEALL ONTOLOGYALL ENTREZID SYMBOL
> 1  GO:0031281         IDA          BP       49       ACR
> 2  GO:0031281         IDA          BP      116   ADCYAP1
> 3  GO:0031281         TAS          BP      116   ADCYAP1
> 4  GO:0031281         TAS          BP      117 ADCYAP1R1
> 5  GO:0031281         IDA          BP      135   ADORA2A
> 6  GO:0031281         TAS          BP      135   ADORA2A
> 7  GO:0031281         IEA          BP      136   ADORA2B
> 8  GO:0031281         TAS          BP      136   ADORA2B
> 9  GO:0031281         TAS          BP      140    ADORA3
> 10 GO:0031281         TAS          BP      153     ADRB1
> 11 GO:0031281         IDA          BP      154     ADRB2
> 12 GO:0031281         IDA          BP      155     ADRB3
> 13 GO:0031281         TAS          BP      554     AVPR2
> 14 GO:0031281         IDA          BP      796     CALCA
> 15 GO:0031281         IEA          BP      796     CALCA
> 16 GO:0031281         IDA          BP      799     CALCR
> 17 GO:0031281         IDA          BP     1394     CRHR1
> 18 GO:0031281         TAS          BP     1394     CRHR1
> 19 GO:0031281         IDA          BP     1812      DRD1
> 20 GO:0031281         IEA          BP     1812      DRD1
> 21 GO:0031281         IDA          BP     1816      DRD5
> 22 GO:0031281         IEA          BP     1816      DRD5
> 23 GO:0031281         NAS          BP     1816      DRD5
> 24 GO:0031281         TAS          BP     1816      DRD5
> 25 GO:0031281         TAS          BP     1909     EDNRA
> 26 GO:0031281         IEA          BP     2692     GHRHR
> 27 GO:0031281         TAS          BP     2696      GIPR
> 28 GO:0031281         TAS          BP     2740     GLP1R
> 29 GO:0031281         IEA          BP     2774      GNAL
> 30 GO:0031281         TAS          BP     2778      GNAS
> 31 GO:0031281         IDA          BP     2852     GPER1
> 32 GO:0031281         ISS          BP     3973     LHCGR
> 33 GO:0031281         ISS          BP     4763       NF1
> 34 GO:0031281         IBA          BP     4842      NOS1
> 35 GO:0031281         IBA          BP     4843      NOS2
> 36 GO:0031281         IMP          BP     4846      NOS3
> 37 GO:0031281         ISS          BP     4846      NOS3
> 38 GO:0031281         TAS          BP     4914     NTRK1
> 39 GO:0031281         TAS          BP     4915     NTRK2
> 40 GO:0031281         TAS          BP     5032    P2RY11
> 41 GO:0031281         ISS          BP     5578     PRKCA
> 42 GO:0031281         NAS          BP     5894      RAF1
> 43 GO:0031281         IEA          BP     7077     TIMP2
> 44 GO:0031281         IDA          BP     7432       VIP
> 45 GO:0031281         IEA          BP     7434     VIPR2
> 46 GO:0031281         TAS          BP    10486      CAP2
> 47 GO:0031281         TAS          BP    10487      CAP1
> Warning message:
> In .generateExtraRows(tab, keys, jointype) :
>    'select' resulted in 1:many mapping between keys and return rows
>
> Best,
>
> Jim
>
>
>
> On 3/19/2014 4:28 AM, Tim Smith wrote:
> > Hi,
> >
> > I was trying to get the genes annotated to the GO term "GO:0031281". 
> My code:
> >
> > library(org.Hs.eg.db)
> >
> > genes <- get("GO:0031281", org.Hs.egGO2EG)
> >
> > When I run the code, I get:
> >
> >> genes <- get("GO:0031281", org.Hs.egGO2EG)
> > Error in .checkKeys(value, Rkeys(x), x at ifnotfound 
> <mailto:x at ifnotfound>) :
> >    value for "GO:0031281" not found
> >
> > If I check in AMIGO for this GO term, it seems to have many gene 
> products for Homo sapiens. Am I doing something wrong? Is there an 
> alternate package that I can try, just to double check the results?
> >
> > thanks!
> >     [[alternative HTML version deleted]]
> >
> >
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
>     [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org <mailto:Bioconductor at r-project.org>
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: 
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099



More information about the Bioconductor mailing list