[BioC] Odd isolated problem with GOstats
Mark W Kimpel
mwkimpel at gmail.com
Sat Jan 5 21:05:47 CET 2008
Marc,
I have attempted to modify my code without success. The reason I was
requesting a patch is because I believe the problem is originating
within the summary function. Below is my code that generatesa
GOHyperGParams object, applies hyperGTest, and then tries to summarize
the result.
I took a look at the test.object created by hyperGTest to see if it
contained GO:ID's that I could remove or subset out as you suggest but
it does not contain any. These seem to be fetched my mget called from
within summary. I looked at the source code for summary and tried to
modify it by having mget return NA ifnotfound (see further below) but
this results in a problem with lapply when "standardGeneric for "Term"
defined from package "AnnotationDbi"" is the FUN applied to NA. So I am
stuck and would appreciate your help.
Please see my comment in the R code for where I think a fix should be
applied. I understand that this may only occur occasionally and only in
BioC-devel, but it would seem useful to those of us who do test BioC-devel.
Thanks, Mark
#############################################################
#MY CODE
params <- new("GOHyperGParams", geneIds = selectedEntrezIds,
universeGeneIds = entrez.universe.vec, annotation = annotationPckg,
ontology = "BP", pvalueCutoff = minP, conditional = TRUE,
testDirection = testDirection)
test.obj<-hyperGTest(params)
#this is where the blow-up occurs
output.table<-data.frame(summary(test.obj))
######################################################################
# GOstats code from hyperGtable.R
##################################################################
setMethod("summary", signature(object="GOHyperGResult"),
function(object, pvalue=pvalueCutoff(object),
categorySize=NULL, htmlLinks=FALSE) {
AMIGO_URL <-
"http://www.godatabase.org/cgi-bin/amigo/go.cgi?view=details&search_constraint=terms&depth=0&query=%s"
df <- callNextMethod(object=object, pvalue=pvalue,
categorySize=categorySize)
if (nrow(df) == 0) {
df$Term <- character(0)
return(df)
}
#This is where it would be nice to filter out invalid goIds but I can't
# figure out how to do it
goIds <- df[[1]]
goTerms <- sapply(mget(goIds, GOenv("TERM"), ifnotfound =
NA), Term)
if (htmlLinks) {
goIdUrls <- sapply(goIds,
function(x) sprintf(AMIGO_URL, x))
goTerms <- paste('<a href="', goIdUrls, '">', goTerms,
'</a>', sep="")
}
df$Term <- goTerms
df
})
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine
15032 Hunter Court, Westfield, IN 46074
(317) 490-5129 Work, & Mobile & VoiceMail
(317) 204-4202 Home (no voice mail please)
mwkimpel<at>gmail<dot>com
******************************************************************
Marc Carlson wrote:
> Marc Carlson wrote:
>> Mark W Kimpel wrote:
>>
>>> I have used a custom wrapper function to invoke GOstats for almost a
>>> year without problem. Now, with just one dataset, I am getting an error.
>>> Not sure if this is a bug or something very odd about just my one
>>> dataset. The GO:ID that yields the error is a valid ID, I checked. I am
>>> using the devel version and rechecked after just updating my packages.
>>>
>>> Any ideas as to what the problem might be?
>>>
>>> Loading required package: rat2302
>>> Error in .checkKeys(value, Lkeys(x), x at ifnotfound) :
>>> invalid key "GO:0016089"
>>>
>>> Enter a frame number, or 0 to exit
>>>
>>> 1: limma.contrast.output.func(AOP, fit2, rslt, contrast.num = 1,
>>> custom.contra
>>> 2: GO.anal.func(input.df = t.tab.annot.obj$sig.named.genes,
>>> annotationPckg = a
>>> 3: GOHyperMaxCats.func(optimizedParam)
>>> 4: data.frame(summary(test.obj))
>>> 5: summary(test.obj)
>>> 6: summary(test.obj)
>>> 7: .local(object, ...)
>>> 8: sapply(mget(goIds, GOenv("TERM")), Term)
>>> 9: lapply(X, FUN, ...)
>>> 10: is.vector(X)
>>> 11: mget(goIds, GOenv("TERM"))
>>> 12: mget(goIds, GOenv("TERM"))
>>> 13: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214",
>>> "GO:0050877", "GO:0
>>> 14: `keys<-`(`*tmp*`, value = c("GO:0007268", "GO:0007214",
>>> "GO:0050877", "GO:0
>>> 15: switch(as.character(direction(x)), "1" = `Lkeys<-`(x, value), "-1" =
>>> `Rkeys
>>> 16: `Lkeys<-`(x, value)
>>> 17: `Lkeys<-`(x, value)
>>> 18: .checkKeys(value, Lkeys(x), x at ifnotfound)
>>>
>>>
>>> > sessionInfo()
>>> R version 2.7.0 Under development (unstable) (2007-12-20 r43739)
>>> x86_64-unknown-linux-gnu
>>>
>>> locale:
>>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
>>>
>>> attached base packages:
>>> [1] splines tools stats graphics grDevices utils datasets
>>> [8] methods base
>>>
>>> other attached packages:
>>> [1] rat2302_2.0.1 affycoretools_1.11.2 annaffy_1.11.1
>>> [4] KEGG_2.0.1 GO_2.0.1 gcrma_2.11.1
>>> [7] matchprobes_1.11.0 biomaRt_1.13.6 RCurl_0.8-3
>>> [10] GOstats_2.5.0 Category_2.5.0 genefilter_1.17.7
>>> [13] survival_2.34 RBGL_1.15.7 annotate_1.17.3
>>> [16] xtable_1.5-2 GO.db_2.1.1 AnnotationDbi_1.1.8
>>> [19] RSQLite_0.6-4 DBI_0.2-4 graph_1.17.14
>>> [22] limma_2.13.3 affy_1.17.3 preprocessCore_1.1.5
>>> [25] affyio_1.7.9 Biobase_1.17.8
>>>
>>> loaded via a namespace (and not attached):
>>> [1] cluster_1.11.9 XML_1.93-2
>>>
>>>
>
> If you could show me what it is that your script is doing (some source
> code) then I might be better able to help you with this. But as it
> stands right now I have very few specifics about what you are trying to
> do. I can see you posted the sessionInfo() which helps me know what
> versions of things you are using. But what is this script that you
> speak of? What does that code actually look like? I am afraid that I
> don't have a patch for you because the annotation package is not
> actually broken, and is only missing information because of what was
> available at GO at the precise moment when that package was made. I
> can't tell you why GO decided to remove that particular term and then
> later on decided to add it back in, but I looked at data that we
> gathered from them to build annotations at different time points and
> that is what has actually happened in this case.
>
>>From your previous posts, it looks like your trouble is caused because
> you are missing a key and then looking for that missing key in an
> environment. If that is what is happening, then I think you should code
> things in a way so that you can filter out keys that are not present.
>
> More specifically I think that you probably just want to do some minor
> cleanup before you call something like mget() to make sure that your
> keys are actually present. Here is a quick example of how this might look:
>
> library(GO.db)
>
> myKeys = c("FAKEID","GO:0000006","GO:0000003")
>
> myKeys[myKeys %in% keys(GOTERM)]
>
>
> Hope this helps,
>
>
> Marc
>
>
>
More information about the Bioconductor
mailing list