[BioC] revmap question

Herve Pages hpages at fhcrc.org
Tue Oct 7 19:16:39 CEST 2008


James W. MacDonald wrote:
> Hi Raffaele,
> 
> rcaloger wrote:
>> Hi,
>> I found the possibility of reversing the mapping with revmap in the
>> XXXX.db annotation databases very interesting.
>>
>> However, I have two problems:
>> 1) if  I use:
>> egs <- c("1", "100", "1000")
>> unlist(mget(egs, revmap(hgu133plus2ENTREZID)))
>>
>> I am getting not only the probesets associated with the three EGs:
>>            1          1001          1002          1003         10001
>>  "229819_at"  "1556117_at"   "204639_at" "216705_s_at"   "203440_at"
>>        10002         10003
>> "203441_s_at"   "237305_at"
> 
> Well, not really. This appears to be so because you are unlisting a 
> named list. Since the names have to be unique,

Well, that's where I don't follow the logic behind unlist(), and I've always
found this "feature" pretty strange. unlist() doesn't even do a good job of
keeping the names unique:
   > unlist(list(AA=letters[1:3], AA2="bb"))
    AA1  AA2  AA3  AA2
    "a"  "b"  "c" "bb"
So mangling the names doesn't solve anything but just adds confusion.

IMO it would be better if unlist() kept the original names, even if that
means they are not unique in the returned vector. At least then I could do
something with the result programmatically, and easily. With the mangled names,
it's much harder (there are a couple of serious pitfalls).
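
As a minimal sketch of that idea (a workaround in base R, not something
unlist() or AnnotationDbi does for you): flatten the list without names, then
repeat each original name once per element. The `probes` list below is only a
stand-in for the result of mget(egs, revmap(hgu133plus2ENTREZID)), with values
copied from the output quoted in this thread, and lengths() assumes a
reasonably recent R (sapply(probes, length) would do the same job).

```r
# Stand-in for the list returned by mget(egs, revmap(hgu133plus2ENTREZID));
# the values are copied from the output quoted in this thread.
probes <- list(
  `1`    = "229819_at",
  `100`  = c("1556117_at", "204639_at", "216705_s_at"),
  `1000` = c("203440_at", "203441_s_at", "237305_at")
)

# Flatten without the name mangling done by unlist() ...
flat <- unlist(probes, use.names = FALSE)

# ... then repeat each original name once per element it contributed.
names(flat) <- rep(names(probes), lengths(probes))

flat
```

The names of `flat` are then the original Entrez Gene IDs, repeated as needed,
so each probeset can be traced back to its EG without parsing mangled names.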

H.


> R adds an additional 
> integer to the end of duplicate names:
> 
>  > egs <- c("1", "100", "1000")
>  > mget(egs, revmap(hgu133plus2ENTREZID))
> $`1`
> [1] "229819_at"
> 
> $`100`
> [1] "1556117_at"  "204639_at"   "216705_s_at"
> 
> $`1000`
> [1] "203440_at"   "203441_s_at" "237305_at"
> 
>> Is there any way to avoid this problem?
>>
>> 2) if the egs vector contains an EG (6333) that is not present in 
>> the annotation database, I get the following error:
>> egs <- c("1", "100", "1000", "6333")
>> unlist(mget(egs, revmap(hgu133plus2ENTREZID)))
>>
>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>  value for "6333" not found
>>
>> Is there any way to make a query that simply skips the 
>> unmapped keys?
> 
> Yes. The help for mget is a bit confusing on this point, but you need to 
> use the argument ifnotfound = NA.
> 
>  > egs <- c("1", "100", "1000", "6333")
>  > mget(egs, revmap(hgu133plus2ENTREZID), ifnotfound = NA)
> $`1`
> [1] "229819_at"
> 
> $`100`
> [1] "1556117_at"  "204639_at"   "216705_s_at"
> 
> $`1000`
> [1] "203440_at"   "203441_s_at" "237305_at"
> 
> $`6333`
> [1] NA
> 
> Best,
> 
> Jim
> 
> 
> 
>>
>>
>> Many thanks
>> Raffaele
>>
>
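
On the second question in the quoted message: once ifnotfound = NA is used,
the unmapped keys can be dropped from the result afterwards. The sketch below
is self-contained and uses a plain environment (the hypothetical `eg2probe`)
as a stand-in for revmap(hgu133plus2ENTREZID), relying on base R's mget(),
which also accepts ifnotfound. The values are copied from the output quoted
above. With the real annotation package, only the mget() call would differ.

```r
# Hypothetical stand-in for revmap(hgu133plus2ENTREZID): a plain environment
# keyed by Entrez Gene ID, filled with the values quoted in this thread.
eg2probe <- new.env()
assign("1",    "229819_at", envir = eg2probe)
assign("100",  c("1556117_at", "204639_at", "216705_s_at"), envir = eg2probe)
assign("1000", c("203440_at", "203441_s_at", "237305_at"), envir = eg2probe)

egs <- c("1", "100", "1000", "6333")

# Unmapped keys come back as NA instead of raising an error.
hits <- mget(egs, envir = eg2probe, ifnotfound = NA)

# Keep only the keys that mapped to at least one probeset.
is_unmapped <- vapply(hits, function(p) all(is.na(p)), logical(1))
mapped <- hits[!is_unmapped]

names(mapped)
```

Here `mapped` keeps the entries for "1", "100" and "1000" and drops "6333".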


