[BioC] GOHyperG for KEGG
Shi, Tao
shidaxia at yahoo.com
Tue Sep 6 19:27:05 CEST 2005
Thank you very much, Gunnar. I'll try that.
At same time, I wrote a function by myself, which I totally stole from GOHyperG. Just want to
share it with everybody. Please let me know if there are any bugs!
...Tao
=============================================================================
KEGGHyperG <-
function (geneIDs, lib = "hgu95av2") {
getDataEnv <- function(name, lib) {
get(paste(lib, name, sep = ""), mode = "environment")
}
require(lib, character.only = TRUE) || stop("need data package", lib)
if (any(duplicated(geneIDs))) stop("input IDs must be unique")
keggV <- as.list(getDataEnv("PATH2PROBE", lib))
whWeHave <- sapply(keggV, function(y) {
if (is.na(y) || length(y) == 0)
return(FALSE)
ids = unique(unlist(y))
any(geneIDs %in% ids)
})
keggV <- keggV[whWeHave]
keggV <- sapply(keggV, function(x) {
if(any(grep("AFFX",x))) {
return(x[-grep("AFFX",x)])
} else {
return(x)
}
} ) ## get rid of control probes
bad <- sapply(keggV, function(x) (length(x) == 1 && is.na(x)))
keggV <- keggV[!bad]
cIDs <- unique(unlist(keggV))
nIDs <- length(cIDs)
keggCounts <- sapply(keggV, length)
ourIDs <- unique(geneIDs[!is.na(geneIDs)])
ours <- ourIDs[!duplicated(ourIDs)]
whGood <- ours[ours %in% cIDs]
nInt = length(whGood)
if (nInt == 0) { warning("no interesting genes found") }
useCts <- sapply(keggV, function(x) sum(whGood %in% x))
pvs <- phyper(useCts - 1, nInt, nIDs - nInt, keggCounts, lower.tail = FALSE)
ord <- order(pvs)
return(list(pvalues = pvs[ord], keggCounts = keggCounts[ord],
chip = lib, kegg2Affy = keggV, intCounts = useCts[ord], numIDs = nIDs,
numInt = nInt, intIDs = geneIDs))
}
==================================================================================
--- Gunnar Wrobel <bioc at gunnarwrobel.de> wrote:
> > Is there a similar function like GOHyperG that works on KEGG? It seems there is no such thing
> > back in Feb. 05 (https://stat.ethz.ch/pipermail/bioconductor/2005-February/007532.html). Any
> > updates?
> Hi Tao,
>
> you might try to do this with goCluster. It does the same kind of
> calculation as GOHyperG but can use any kind of annotation.
>
> Cheers
>
> Gunnar
>
More information about the Bioconductor
mailing list