[R] Using getSYMBOL, annotate package on a list with empty elements.

Martin Morgan mtmorgan at fhcrc.org
Sat Feb 13 16:09:56 CET 2010


On 02/13/2010 06:21 AM, David Winsemius wrote:
> 
> On Feb 13, 2010, at 3:18 AM, Sahil Seth wrote:
> 
>> Hi,
>> I have been trying to find a solution to this issue, but have not been
>> able to do so.
>> I am trying to use sapply on the function getSYMBOL,
> 
> The annotate package is from Bioconductor.
> 
>> an extract from the list is:
>>> test.goP[13:14]
>> $`GO:0000050`
>>       IEA       IEA       IEA       IEA       TAS       TAS       TAS       IEA
>> "5270753" "5720725" "1690128" "4850681"  "110433" "2640544" "4900370" "1430280"
>>       IEA       NAS       TAS       IEA
>> "6110044" "1170615" "6590546" "1690632"
>>
>> $`GO:0000052`
>> [1] NA
>>
>> goG=sapply(test.goP,getSYMBOL,data="hgu95av2")
> 
> I was a bit surprised to see a data= argument to sapply. That didn't
> seem typical but perhaps I am not aware of all the tricks available.

> args(sapply)
function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)

The ... arguments are passed to FUN. Here FUN is getSYMBOL

> args(getSYMBOL)
function (x, data)

so data="hgu95av2" is used in each application of getSYMBOL. This could
have been written as

  sapply(test.goP, getSYMBOL, "hgu95av2")

using positional matching to getSYMBOL's arguments. A 'trick' that is
sometimes useful is to name FUN's first argument in the call, which
forces sapply to vary the second (or other) argument rather than the
first.
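
A minimal sketch of both points, runnable in base R without annotate
installed. The function lookup() is a hypothetical stand-in for a
two-argument function such as getSYMBOL(x, data); it is not part of any
package.

```r
# Hypothetical stand-in for a two-argument function like getSYMBOL(x, data)
lookup <- function(x, data) paste(data, x, sep = ":")

# Arguments after FUN are forwarded to every call, by name or by position
by_name <- sapply(c("a", "b"), lookup, data = "chip")
by_pos  <- sapply(c("a", "b"), lookup, "chip")
identical(by_name, by_pos)

# Naming the first argument (x) makes sapply vary the second one (data)
vary_data <- sapply(c("chip1", "chip2"), lookup, x = "a")
vary_data
```

In the last call each element of the input vector is matched to data,
because x is already taken by the named argument.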


>> error: "Error in .checkKeysAreWellFormed(keys) :
>>  keys must be supplied in a character vector with no NAs "
>> Here the 14th element is NA, so getSYMBOL raises an error.
> 
> The way this is being displayed makes me think you are processing a list
> with named elements. You should use str to determine what test.goP
> really is. Then you will have a better idea what functions would be
> appropriate. You may want to compose a function to go with the sapply
> loop that omits the NAs:
> 
> ?na.omit
> 
> You should also post the results of dput or dump on the test objects you
> are working with. That way the list readers can also get access to that
> information.
>>
>> getSYMBOL has to be given a character vector, so a simple solution is in
>> fact to delete the missing elements from the list.
>>
>> I have been trying to find a solution for it, but in vain. I tried
>> complete.cases(goP), na.omit(goP), and several other things.
>>
>> Any suggestions please ?

  sapply(test.goP[!is.na(test.goP)], getSYMBOL, "hgu95av2")
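
The filtering step on its own can be tried on a toy list in base R. This
is only a sketch: goP below is made-up data shaped like the extract
above, and the second variant assumes NAs might also occur inside
otherwise valid elements, which the original post does not show.

```r
# Toy list shaped like test.goP: one element of probe IDs, one lone NA
goP <- list(`GO:0000050` = c(IEA = "5270753", TAS = "110433"),
            `GO:0000052` = NA)

# is.na() on a list is TRUE only for elements that are a single NA,
# so negating it keeps the elements that hold real IDs
clean <- goP[!is.na(goP)]
names(clean)

# Variant (assumption: NAs may also be mixed in with IDs):
# strip NAs inside each element, then drop elements left empty
stripped <- lapply(goP, function(ids) ids[!is.na(ids)])
stripped <- stripped[sapply(stripped, length) > 0]
names(stripped)
```

Either clean or stripped could then be handed to sapply in place of the
original list.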

> 
> In addition to the above suggestions ... post on the correct list?

  http://bioconductor.org/

Martin

> 
>> -- 
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


