[Bioc-devel] Oddity in hgu133plus2.db?

Marc Carlson mcarlson at fhcrc.org
Tue Dec 1 22:35:30 CET 2009


Hi Lasse,

Just to clarify and add some comments: the behavior of Rkeys() on these
mappings is actually the same as it has always been. It returns all of
the right keys, whether they have been mapped or not. To get only the
mapped right keys, you would want to use mappedRkeys(). The difference
is that the set of right keys has now been expanded to include all
possible keys (gene symbols, in this example), while before it was
artificially subset to only the keys for genes represented by the
platform.

Now for some platforms, this may mean that the value of Rkeys(x) could
previously have been the same as mappedRkeys(x). That would have
happened whenever everything in 'x' was mapped on that platform. But
this was only ever a serendipitous situation, and not meant to be relied
upon. So if you really want to check for mapped keys, you should use
mappedRkeys().
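
As a rough sketch of the distinction (assuming the hgu133plus2.db package
is installed; the variable names all_symbols and mapped_symbols are just
for illustration, and the actual results depend on the annotation
release you have), something like this should show the difference:

```r
# Sketch only: runs the comparison if hgu133plus2.db is available.
if (requireNamespace("hgu133plus2.db", quietly = TRUE)) {
  library(hgu133plus2.db)

  x <- hgu133plus2SYMBOL

  # All right keys (gene symbols), whether or not a probeset maps to them.
  all_symbols <- Rkeys(x)

  # Only the right keys that are linked to at least one probeset.
  mapped_symbols <- mappedRkeys(x)

  # The mapped keys are a subset of the full set of right keys.
  stopifnot(all(mapped_symbols %in% all_symbols))

  # Compare the sizes of the two key sets.
  print(length(all_symbols))
  print(length(mapped_symbols))

  # Existence as a right key versus actually being mapped on this chip.
  print("CD68" %in% all_symbols)
  print("CD68" %in% mapped_symbols)
}
```

The second membership test is the one to use when the question is
whether the chip actually has a probeset for a given symbol.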

I hope this clarifies things,


  Marc



Lasse Folkersen wrote:
> Ok. I'll put checks for gene symbols linking to NA values for probesets
> into my code in the future. Thank you so much for the info.
> Lasse
>
> 2009/11/25 James W. MacDonald <jmacdon at med.umich.edu>:
>   
>> I should change slightly what I have said. The hgu133plus2.db package in the
>> new version of BioC has changed quite a bit, and no longer contains much
>> data. Instead, it is a thin wrapper for the org.Hs.eg.db package.
>>
>> Since org.Hs.eg.db *does* contain mappings for this symbol, the Rkey exists:
>>
>> > grep("CD68", Rkeys(hgu133plus2SYMBOL), value=T)
>> [1] "CD68"
>>
>> But this doesn't mean this gene product is interrogated by the hgu133plus2
>> chip. The Lkey for this symbol is NA, because a matching probeset is Not
>> Available on the hgu133plus2 chip.
>>
>> So the behavior is consistent - there is a Rkey for this symbol, but the
>> Lkey is NA.
>>
>> Best,
>>
>> Jim
>>
>>
>>
>> Lasse Folkersen wrote:
>>     
>>> I see. It's just that when genes don't exist in a table, they usually
>>> give error messages, like this:
>>>
>>> get("whatever", revmap(hgu133plus2SYMBOL))
>>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>>  value for "whatever" not found
>>>
>>> instead of NA. When asking for existence
>>> "CD68" %in% Rkeys(hgu133plus2SYMBOL)
>>> it does give TRUE
>>>
>>> So I thought it could have been a bug or unwanted behaviour. But
>>> thanks for your answer.
>>> Lasse
>>>
>>> 2009/11/25 James MacDonald <jmacdon at med.umich.edu>:
>>>       
>>>> Hi Lasse,
>>>>
>>>> This gene doesn't exist in that table:
>>>>
>>>> > get("CD68", revmap(hgu133plus2SYMBOL))
>>>> [1] NA
>>>>
>>>> It just so happens that selecting things the way you did returns an empty
>>>> ProbeAnnDbBimap, which when converted to character gives you character(0).
>>>>
>>>> > revmap(hgu133plus2SYMBOL)["CD68"]
>>>> revmap(SYMBOL) submap for chip hgu133plus2 (object of class
>>>> "ProbeAnnDbBimap")
>>>>
>>>> Best,
>>>>
>>>> Jim
>>>>
>>>>
>>>>         
>>>>>>> Lasse Folkersen <lasse.folkersen at ki.se> 11/25/09 6:40 AM >>>
>>>>>>>               
>>>> I know it is a very specific case, but this seems to me like a general
>>>> error:
>>>>
>>>> in hgu133plus2.db package, using
>>>> as.character(revmap(hgu133plus2SYMBOL)["CD68"])
>>>> returns
>>>> named character(0)
>>>> Now, it may be that annotations change, but isn't it a mistake that
>>>> there exists an Rkey entry for the gene which links to nothing?
>>>> Usually genes with no known probesets just didn't exist in the
>>>> database at all.
>>>>
>>>> Best regards
>>>> Lasse
>>>>
>>>> _______________________________________________
>>>> Bioc-devel at stat.math.ethz.ch mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>>
>>>> **********************************************************
>>>> Electronic Mail is not secure, may not be read every day, and should not
>>>> be used for urgent or sensitive issues
>>>>
>>>>         
>> --
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826


