[BioC] revmap question
lgautier at altern.org
Thu Oct 9 17:29:26 CEST 2008
> James W. MacDonald wrote:
>> Hi Raffaele,
>>
>> rcaloger wrote:
>>> Hi,
>>> I found the possibility of reversing the mapping with revmap in the
>>> XXXX.db annotation databases very interesting.
>>>
>>> However, I have two problems:
>>> 1) if I use:
>>> egs <- c("1", "100", "1000")
>>> unlist(mget(egs, revmap(hgu133plus2ENTREZID)))
>>>
>>> I get more than just the probesets associated with the three EGs:
>>>             1          1001          1002          1003         10001
>>>   "229819_at"  "1556117_at"   "204639_at" "216705_s_at"   "203440_at"
>>>         10002         10003
>>> "203441_s_at"   "237305_at"
>>
>> Well, not really. This appears to be so because you are unlisting a
>> named list. Since the names have to be unique,
>
> Well, that's where I don't follow the logic behind unlist(), and I've always
> found this "feature" pretty strange. unlist() doesn't even do a good job
> of keeping the names unique:
> > unlist(list(AA=letters[1:3], AA2="bb"))
> AA1 AA2 AA3 AA2
> "a" "b" "c" "bb"
> So mangling the names doesn't solve anything but just adds confusion.
>
> IMO it would be better if unlist() kept the original names, even if that
> means they are not unique in the returned vector. At least I could do
> something with them programmatically, and it would be easy. With the
> mangled names, it's much harder (there are a couple of serious pitfalls).
>
The problem might originate in what one could perceive as a flaw in lists
(or any named vectors, for that matter): they allow non-unique names.
Mangled names are surely a headache, as is R's behavior of returning only
the first element with a given name when you did not know that several
elements shared that name.
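
To make the "first element only" behavior concrete, here is a minimal
sketch in base R. The vector x is made up purely for illustration.

```r
# A named vector with a duplicated name; base R accepts this without complaint.
x <- c(a = 1, b = 2, a = 3)

# Indexing by name silently returns only the first match.
first_only <- x["a"]
first_only

# To get every element carrying that name, compare against names() explicitly.
all_a <- x[names(x) == "a"]
all_a

# anyDuplicated() returns 0 when all names are unique, so this flags the problem.
has_dup_names <- anyDuplicated(names(x)) > 0
has_dup_names
```

Checking anyDuplicated(names(x)) before indexing by name is one cheap way
to avoid being surprised by this.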
L.
> H.
>
>
>> R adds an additional
>> integer to the end of duplicate names:
>>
>> > egs <- c("1", "100", "1000")
>> > mget(egs, revmap(hgu133plus2ENTREZID))
>> $`1`
>> [1] "229819_at"
>>
>> $`100`
>> [1] "1556117_at" "204639_at" "216705_s_at"
>>
>> $`1000`
>> [1] "203440_at" "203441_s_at" "237305_at"
>>
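
One way to flatten such a list without the mangled names is to drop the
names in unlist() and rebuild the key column by hand. This is only a
sketch: res is a hand-typed stand-in for the mget() result quoted above,
so it runs without the annotation package, and the column names
entrez_id and probe_id are my own choice.

```r
# Stand-in for the mget() result quoted above (typed in by hand).
res <- list(`1`    = "229819_at",
            `100`  = c("1556117_at", "204639_at", "216705_s_at"),
            `1000` = c("203440_at", "203441_s_at", "237305_at"))

# Flatten without names, so unlist() has nothing to mangle.
probes <- unlist(res, use.names = FALSE)

# Repeat each Entrez Gene ID once per probeset it maps to.
n_per_key <- vapply(res, length, integer(1))
eg <- rep(names(res), times = n_per_key)

# A two-column table keeps the mapping explicit instead of encoding it in names.
map <- data.frame(entrez_id = eg, probe_id = probes, stringsAsFactors = FALSE)
map
```

If a named vector is really wanted, names(probes) <- eg gives one with the
original (non-unique) Entrez Gene IDs as names.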
>>> Is there any way to avoid this problem?
>>>
>>> 2) if the egs vector contains an EG (6333) that is not present in
>>> the annotation database, I get the following error:
>>> egs <- c("1", "100", "1000", "6333")
>>> unlist(mget(egs, revmap(hgu133plus2ENTREZID)))
>>>
>>> Error in .checkKeys(value, Rkeys(x), x@ifnotfound) :
>>> value for "6333" not found
>>>
>>> Is there any way to make a query that simply skips the
>>> unmapped keys?
>>
>> Yes. The help for mget is a bit confusing on this point, but you need to
>> use the argument ifnotfound = NA.
>>
>> > egs <- c("1", "100", "1000", "6333")
>> > mget(egs, revmap(hgu133plus2ENTREZID), ifnotfound = NA)
>> $`1`
>> [1] "229819_at"
>>
>> $`100`
>> [1] "1556117_at" "204639_at" "216705_s_at"
>>
>> $`1000`
>> [1] "203440_at" "203441_s_at" "237305_at"
>>
>> $`6333`
>> [1] NA
>>
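
If the unmapped keys should disappear altogether rather than show up as
NA, the NA entries can be filtered out afterwards. A minimal sketch,
again using a hand-typed stand-in (res) for the result quoted above; the
helper is_unmapped is a name I made up.

```r
# Stand-in for the mget(..., ifnotfound = NA) result quoted above.
res <- list(`1`    = "229819_at",
            `100`  = c("1556117_at", "204639_at", "216705_s_at"),
            `1000` = c("203440_at", "203441_s_at", "237305_at"),
            `6333` = NA)

# Hypothetical helper: TRUE for entries that hold nothing but NA.
is_unmapped <- function(p) all(is.na(p))

# Keep only the keys that actually mapped to at least one probeset.
mapped <- res[!vapply(res, is_unmapped, logical(1))]
names(mapped)
```

Filtering the keys before the query, for example with
egs[egs %in% mappedkeys(revmap(hgu133plus2ENTREZID))], should give the
same result, but I have not run that here.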
>> Best,
>>
>> Jim
>>
>>
>>
>>>
>>>
>>> Many thanks
>>> Raffaele
>>>
>>
>