[BioC] problem with getGOOntology

Robert Gentleman rgentlem at fhcrc.org
Fri Apr 8 00:51:19 CEST 2005


Well, some FYIs seem in order.

1) find("GOTERM") is pretty simple and would have solved the first
misapprehension, as would ls(pos="package:GOstats").

2) The metadata packages are "a package": if you have version 1.7.0 of
one of them, you had better have 1.7.0 of all of them (and you can find,
for any one of them, what sources it was built from).
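
As a small illustration of point 1, here is a sketch that runs without any
Bioconductor packages installed. It uses base R objects as stand-ins (that
substitution is an assumption for the sake of a runnable example); in the
situation from this thread one would call find("GOTERM") and
ls(pos="package:GOstats") in the same way.

```r
# find() reports which attached package(s) on the search path define a name.
where_mean <- find("mean")
print(where_mean)

# ls() on a search-path entry lists the objects that entry provides, so it
# shows directly whether a name lives in a given package.
stats_objects <- ls(pos = "package:stats")
print("median" %in% stats_objects)  # median() is defined in stats
print("mean" %in% stats_objects)    # mean() is defined in base, not stats
```

The same two calls answer "which package does this object actually come
from?" before one starts checking version numbers.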


On Apr 7, 2005, at 3:29 PM, Francois Pepin wrote:

> Taking over from Sean (the original poster, who's sitting beside me) a
> little bit.
>
> The original problem was that some of the annotation packages
> (mouse4302) referred to a GO term that wasn't in the GO package. As
> Sean Davis correctly pointed out, it is in version 1.7.0, while we had
> 1.6.5. I thought that the GOTERM env was in GOstats (which was up to
> date), and I wasn't aware of the GO package.
>
> That part of the problem is now fixed, and as far as we're concerned,
> the code is working.
>
> The problem was with GOHyperG, which uses getGOOntology to limit itself
> to a given category of terms. In this particular case, the ordering is
> crucial. Between changing the ordering and giving the error, I'd rather
> have the error.
>

Me too - but see below for other issues

> The reason we're calling this a bug is that the documentation says that
> NA's would be produced:
> (from the bottom of ?getGOOntology)
>      For 'getGOOntology' a vector of categories (the names of which are
>      the original GO term names). Elements of this list that are 'NA'
>      indicate term names for which there is no category (and hence they
>      are not really term names).
>
> Ideally, NA's would be inserted. The next best solution would be to
> change the documentation and say that an error would be produced if any
> invalid term is inserted.
>

The intention is for NAs to propagate - but if you use the correct
meta-data packages this problem is not likely (unless you have a typo).
If you use mixes of meta-data packages (different version numbers) then
you really end up with garbage - the mappings will simply not be
correct. Perhaps we should do more to check for that - but that is
not as simple as it sounds.
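
A very rough sketch of the simplest such check, comparing installed version
numbers, is below. The helper name same_version is hypothetical and not part
of any package, and utils::packageVersion() is assumed to be available (it is
in current R). It only compares version strings; it does not verify that the
packages were built from the same sources.

```r
# Hypothetical helper: TRUE when all the named installed packages report
# exactly the same version string.
same_version <- function(pkgs) {
  versions <- vapply(pkgs,
                     function(p) as.character(utils::packageVersion(p)),
                     character(1))
  length(unique(versions)) == 1
}

# Base packages ship with R and share its version number, so they serve as
# a runnable stand-in for the metadata packages here.
base_ok <- same_version(c("stats", "utils"))
print(base_ok)

# For the case in this thread one would pass the metadata packages instead
# (assumes both are installed):
# same_version(c("GO", "mouse4302"))
```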

I think the new method I posted will get you closer - but it seems that  
you don't need it.
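
For reference, the index-preserving behaviour under discussion can be
sketched on a toy environment. Everything here is invented for illustration:
toy_terms stands in for GOTERM, the IDs and categories are made up, and
lookup_category is a hypothetical helper, not the GOstats implementation.

```r
# Toy stand-in for the GOTERM environment (IDs and categories are invented).
toy_terms <- new.env()
assign("GO:0000001", "MF", envir = toy_terms)
assign("GO:0000002", "CC", envir = toy_terms)
assign("GO:0000003", "BP", envir = toy_terms)

# Hypothetical helper: look up each ID, keeping one result per input so that
# unknown IDs come back as NA instead of being dropped or raising an error.
lookup_category <- function(ids, env) {
  hits <- mget(ids, envir = env, ifnotfound = NA)
  vapply(hits,
         function(h) if (is.na(h)) NA_character_ else h,
         character(1))
}

ids <- c("GO:0000001", "GO:9999998", "GO:9999999", "GO:0000002", "GO:0000003")
categories <- lookup_category(ids, toy_terms)
print(categories)

# Dropping the NAs shortens the result, so positions no longer line up with
# the input IDs.
filtered <- categories[!is.na(categories)]
print(length(filtered))
```

Because the result stays the same length as the input and carries the IDs as
names, a caller can still tell which inputs were not recognised.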

Robert


> Francois
>
> On Thu, 2005-04-07 at 18:17, Robert Gentleman wrote:
>> Yes, it probably requires a slightly better handling of missing values
>> (although that is a bit of a can of worms). I will try to see what can
>> be done in time for the next release.
>>
>> If you are interested you might try defining the following method -
>> largely untested as I am travelling -
>>
>> setMethod("Ontology", signature="ANY", function(object)
>>     if (is.na(object)) NA else callNextMethod())
>>
>> with that in place I get:
>>> getGOOntology("GO:11111")
>> GO:11111
>>        NA
>>
>> which seems like what is wanted, but I am sure there will be
>> consequences. Let me know if that triggers more downstream nastiness,
>> as I am just not sure where you are going (note also that we don't
>> seem to have either sessionInfo for the original report or a workable
>> example of what is wanted - these are kind of important).
>>
>> On Apr 7, 2005, at 2:59 PM, Francois Pepin wrote:
>>
>>> On Thu, 2005-04-07 at 17:58, Sean Davis wrote:
>>>
>>>>>> getGOOntology
>>>>> function (x)
>>>>> {
>>>>>     if (!is.character(x))
>>>>>         stop("need a character argument")
>>>>>     if (length(x) == 0)
>>>>>         return(character(0))
>>>>>     wh <- mget(x, env = GOTERM, ifnotfound = NA)
>>>>>     return(sapply(wh, Ontology))
>>>>> }
>>>>>
>>>>
>>>> My bad.  The lowercase 'o' got me.
>>>>
>>>> If you haven't already done it, looks like a line:
>>>>
>>>> wh <- wh[!is.na(wh)]
>>>>
>>>> right before the return line might do the trick.
>>>
>>> Not quite, as it would mess the indices up if you were to load a list
>>> of terms.
>>>
>>> Instead of having c('MF', NA, NA, 'CC', 'BP') as a result you'd end
>>> up having c('MF','CC','BP') and think that the 3rd one is a biological
>>> process term rather than not being a GO term (and have no clue for
>>> the last 2).
>>>
>>> Francois
>>>
>>>
>>>> Sean
>>>
>>>
>>>
>> +-------------------------------------------------------------------+
>> | Robert Gentleman              phone: (206) 667-7700               |
>> | Head, Program in Computational Biology   fax:  (206) 667-1319     |
>> | Division of Public Health Sciences       office: M2-B865          |
>> | Fred Hutchinson Cancer Research Center                            |
>> | email: rgentlem at fhcrc.org                                      |
>> +-------------------------------------------------------------------+
>
>
+-------------------------------------------------------------------+
| Robert Gentleman              phone: (206) 667-7700               |
| Head, Program in Computational Biology   fax:  (206) 667-1319     |
| Division of Public Health Sciences       office: M2-B865          |
| Fred Hutchinson Cancer Research Center                            |
| email: rgentlem at fhcrc.org                                      |
+-------------------------------------------------------------------+

