[BioC] GOHyperG for KEGG
Dick Beyer
dbeyer at u.washington.edu
Wed Sep 14 19:09:00 CEST 2005
Hi Tao,
I tried out your KEGGHyperG function. Seems to work just great. Thanks very much,
Dick
*******************************************************************************
Richard P. Beyer, Ph.D. University of Washington
Tel.:(206) 616 7378 Env. & Occ. Health Sci. , Box 354695
Fax: (206) 685 4696 4225 Roosevelt Way NE, # 100
Seattle, WA 98105-6099
http://depts.washington.edu/ceeh/ServiceCores/FC5/FC5.html
http://staff.washington.edu/~dbeyer
*******************************************************************************
--- "Shi, Tao" <shidaxia at yahoo.com> wrote:
Message: 8
Date: Tue, 6 Sep 2005 10:27:05 -0700 (PDT)
From: "Shi, Tao" <shidaxia at yahoo.com>
Subject: Re: [BioC] GOHyperG for KEGG
To: Gunnar Wrobel <bioc at gunnarwrobel.de>
Cc: bioconductor at stat.math.ethz.ch
Message-ID: <20050906172705.66173.qmail at web52703.mail.yahoo.com>
Content-Type: text/plain; charset=iso-8859-1
Thank you very much, Gunnar. I'll try that.
At same time, I wrote a function by myself, which I totally stole from GOHyperG.
Just want to
share it with everybody. Please let me know if there are any bugs!
...Tao
=============================================================================
KEGGHyperG <-
function (geneIDs, lib = "hgu95av2") {
getDataEnv <- function(name, lib) {
get(paste(lib, name, sep = ""), mode = "environment")
}
require(lib, character.only = TRUE) || stop("need data package", lib)
if (any(duplicated(geneIDs))) stop("input IDs must be unique")
keggV <- as.list(getDataEnv("PATH2PROBE", lib))
whWeHave <- sapply(keggV, function(y) {
if (is.na(y) || length(y) == 0)
return(FALSE)
ids = unique(unlist(y))
any(geneIDs %in% ids)
})
keggV <- keggV[whWeHave]
keggV <- sapply(keggV, function(x) {
if(any(grep("AFFX",x))) {
return(x[-grep("AFFX",x)])
} else {
return(x)
}
} ) ## get rid of control probes
bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x)))
keggV <- keggV[!bad]
cIDs <- unique(unlist(keggV))
nIDs <- length(cIDs)
keggCounts <- sapply(keggV, length)
ourIDs <- unique(geneIDs[!is.na(geneIDs)])
ours <- ourIDs[!duplicated(ourIDs)]
whGood <- ours[ours %in% cIDs]
nInt = length(whGood)
if (nInt == 0) { warning("no interesting genes found") }
useCts <- sapply(keggV, function(x) sum(whGood %in% x))
pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE)
ord <- order(pvs)
return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord],
chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs,
numInt = nInt, intIDs = geneIDs))
}
================================================================================
==
--- Gunnar Wrobel <bioc at gunnarwrobel.de> wrote:
> > Is there a similar function like GOHyperG that works on KEGG? It seems
there is no such thing
> > back in Feb. 05
(https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.html). Any
> > updates?
> Hi Tao,
>
> you might try to do this with goCluster. It does the same kind of
> calculation as GOHyperG but can use any kind of annotation.
>
> Cheers
>
> Gunnar
>
More information about the Bioconductor
mailing list