[BioC] problem with getGOOntology
Robert Gentleman
rgentlem at fhcrc.org
Fri Apr 8 00:51:19 CEST 2005
Well, some FYIs seem in order:
1) find("GOTERM") is pretty simple and would have cleared up the first
misapprehension, as would ls(pos="package:GOstats").
2) The metadata packages are effectively "a package": if you have version
1.7.0 of one of them, you had better have 1.7.0 of all of them (and for any
one of them you can find out what sources it was built from).
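
As a rough illustration of point 1, the sketch below shows how find() and
ls() locate an object on the search path. It assumes nothing about GO or
GOstats being installed: "median" and "package:stats" are stand-ins for
"GOTERM" and "package:GOstats", chosen only so the example runs in a plain
R session.

```r
# Which attached package(s) define an object? find() walks the search path.
# "median" is a stand-in for "GOTERM" so this runs without Bioconductor.
where <- find("median")
print(where)

# List the objects an attached package makes visible, then check membership.
# "package:stats" is a stand-in for "package:GOstats".
exports <- ls(pos = "package:stats")
print("median" %in% exports)
```

With the real packages attached, the same two calls would show whether
GOTERM lives in GOstats or somewhere else.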
On Apr 7, 2005, at 3:29 PM, Francois Pepin wrote:
> Taking over from Sean (the original poster, who's sitting beside me) a
> little bit.
>
> The original problem was that some of the annotation packages
> (mouse4302) referred to a GO term that wasn't in the GO package. As Sean
> Davis correctly pointed out, it is in version 1.7.0, while we had 1.6.5.
> I thought that the GOTERM env was in GOstats (which was up to date), and
> I wasn't aware of the GO package.
>
> That part of the problem is now fixed, so as far as we're concerned
> the code is working.
>
> The problem was with GOHyperG, which uses getGOOntology to limit itself
> to a given category of terms. In this particular case, the ordering is
> crucial. Between changing the ordering and giving the error, I'd rather
> have the error.
>
Me too - but see below for other issues
> The reason we're calling this a bug is that the documentation says that
> NA's would be produced:
> (from the bottom of ?getGOOntology)
> For 'getGOOntology' a vector of categories (the names of which are
> the original GO term names). Elements of this list that are 'NA'
> indicate term names for which there is no category (and hence they
> are not really term names).
>
> Ideally, NA's would be inserted. The next best solution would be to
> change the documentation and say that an error would be produced if any
> invalid term is inserted.
>
The intention is for NAs to propagate, but if you use the correct
(matching) meta-data packages this problem is unlikely (unless you have
a typo). If you mix meta-data packages with different version numbers,
you really end up with garbage: the mappings will simply not be correct.
Perhaps we should do more to check for that, but that is not as simple
as it sounds.
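
A minimal sketch of the simplest possible check, assuming all that is
wanted is a comparison of installed version strings. The helper name
same_version is hypothetical, and base packages stand in for the metadata
packages so that the example runs anywhere. It says nothing about which
sources a package was built from, which is the harder part.

```r
# Hypothetical helper: TRUE if every named installed package reports the
# same version string. It only compares versions, nothing more.
same_version <- function(pkgs) {
  versions <- vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  )
  length(unique(versions)) == 1L
}

# Base packages are released together with R, so they stand in here for a
# matched set of metadata packages (GO, mouse4302, ... are NOT assumed
# to be installed).
matched <- same_version(c("stats", "utils", "graphics"))
print(matched)
```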
I think the new method I posted will get you closer - but it seems that
you don't need it.
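
For reference, the NA-preserving behaviour discussed in this thread can be
sketched in base R as follows. This is not the GOstats implementation: the
environment toy_terms, its ids and the helper lookup_ontology are all made
up for illustration, with a plain environment standing in for GOTERM.

```r
# Toy stand-in for the GOTERM environment: term id -> ontology code.
# The ids and their codes are invented for this example.
toy_terms <- new.env()
assign("GO:toy1", "MF", envir = toy_terms)
assign("GO:toy2", "CC", envir = toy_terms)

# Hypothetical helper: look up each id and keep NA for unknown ids, so the
# result has the same length and order as the input.
lookup_ontology <- function(ids, env) {
  if (!is.character(ids)) stop("need a character argument")
  if (length(ids) == 0) return(character(0))
  found <- mget(ids, envir = env, ifnotfound = NA_character_)
  vapply(found, function(v) as.character(v)[1], character(1))
}

ids <- c("GO:toy1", "GO:11111", "GO:toy2")
res <- lookup_ontology(ids, toy_terms)
print(res)
```

Because unknown ids stay in place as NA, position i of the result always
describes position i of the input, which is the ordering GOHyperG relies on.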
Robert
> Francois
>
> On Thu, 2005-04-07 at 18:17, Robert Gentleman wrote:
>> Yes, it probably requires a slightly better handling of missing values
>> (although that is a bit of a can of worms). I will try to see what can
>> be done in time for the next release.
>>
>> If you are interested you might try defining the following method -
>> largely untested as I am travelling -
>>
>> setMethod("Ontology", signature="ANY",
>>           function(object) if (is.na(object)) NA else callNextMethod())
>>
>> with that in place I get:
>> > getGOOntology("GO:11111")
>> GO:11111
>> NA
>>
>> which seems like what is wanted, but I am sure there will be
>> consequences. Let me know if that triggers more downstream nastiness,
>> as I am not quite sure where you are going. (Note also that we don't
>> seem to have either sessionInfo for the original report or a workable
>> example of what is wanted; these are kind of important.)
>>
>> On Apr 7, 2005, at 2:59 PM, Francois Pepin wrote:
>>
>>> On Thu, 2005-04-07 at 17:58, Sean Davis wrote:
>>>
>>>>> > getGOOntology
>>>>> function (x)
>>>>> {
>>>>>     if (!is.character(x))
>>>>>         stop("need a character argument")
>>>>>     if (length(x) == 0)
>>>>>         return(character(0))
>>>>>     wh <- mget(x, env = GOTERM, ifnotfound = NA)
>>>>>     return(sapply(wh, Ontology))
>>>>> }
>>>>>
>>>>
>>>> My bad. The lowercase 'o' got me.
>>>>
>>>> If you haven't already done it, looks like a line:
>>>>
>>>> wh <- wh[!is.na(wh)]
>>>>
>>>> right before the return line might do the trick.
>>>
>>> Not quite, as it would mess the indices up if you were to load a list
>>> of terms.
>>>
>>> Instead of having c('MF', NA, NA, 'CC', 'BP') as a result you'd end up
>>> having c('MF','CC','BP') and think that the 3rd one is a biological
>>> process term rather than not being a GO term (and have no clue for the
>>> last 2).
>>>
>>> Francois
>>>
>>>
>>>> Sean
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>
>>>
>> +-----------------------------------------------------------------+
>> | Robert Gentleman                        phone: (206) 667-7700   |
>> | Head, Program in Computational Biology  fax: (206) 667-1319     |
>> | Division of Public Health Sciences      office: M2-B865         |
>> | Fred Hutchinson Cancer Research Center                          |
>> | email: rgentlem at fhcrc.org                                    |
>> +-----------------------------------------------------------------+
>
>
+-----------------------------------------------------------------+
| Robert Gentleman                        phone: (206) 667-7700   |
| Head, Program in Computational Biology  fax: (206) 667-1319     |
| Division of Public Health Sciences      office: M2-B865         |
| Fred Hutchinson Cancer Research Center                          |
| email: rgentlem at fhcrc.org                                    |
+-----------------------------------------------------------------+