[R] confused by lapply

Peter Ehlers ehlers at ucalgary.ca
Thu Feb 17 11:26:04 CET 2011


On 2011-02-16 09:42, Sam Steingold wrote:
> Description:
>
>       'lapply' returns a list of the same length as 'X', each element of
>       which is the result of applying 'FUN' to the corresponding element
>       of 'X'.
>
> I expect that when I do
>> lapply(vec,f)
> f would be called _once_ for each component of vec.
>
> this is not what I see:
>
> parse.num <- function (s) {
>    cat("parse.num1\n"); str(s)
>    s <- as.character(s)
>    cat("parse.num2\n"); str(s)
>    if (s == "N/A") return(s);
>    as.numeric(gsub("M$","e6",gsub("B$","e9",s)));
> }
>
>
>> vec
>       mcap
> 1  200.5B
> 2   19.1M
> 3  223.7B
> 4  888.0M
> 5  141.7B
> 6  273.5M
> 7 55.649B
>> str(vec)
> 'data.frame':	7 obs. of  1 variable:
>   $ mcap: Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6
>> vec<-lapply(vec,parse.num)
> parse.num1
>   Factor w/ 7 levels "141.7B","19.1M",..: 3 2 4 7 1 5 6
> parse.num2
>   chr [1:7] "200.5B" "19.1M" "223.7B" "888.0M" "141.7B" "273.5M" ...
> Warning message:
> In if (s == "N/A") return(s) :
>    the condition has length > 1 and only the first element will be used
>
> i.e., somehow parse.num is called on the whole vector vec, not its
> components.
>
> what am I doing wrong?

Your 'vec' is NOT a vector. As your str(vec) clearly
shows, you have a *data.frame*. A data.frame is a list
whose components are its columns (variables). You have
only one column, so your function is called once, with
that whole column as its argument. If you had two
columns, parse.num would be applied to each column.
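
You can check this for yourself. Here is a small sketch using a
made-up two-column data frame (the names df, a, b and col_lengths
are mine, purely for illustration):

```r
# Hypothetical data frame, for illustration only
df <- data.frame(a = c("1.5B", "2M"), b = c("3M", "4B"))

is.list(df)   # a data.frame is a list of its columns
length(df)    # the number of columns, not the number of rows

# lapply() calls FUN once per column; each call receives the whole column
col_lengths <- lapply(df, length)
str(col_lengths)
```

Each call to FUN sees a complete column, which is why the
if() inside parse.num complains about a condition of length > 1.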

So do this:

  lapply(vec[, 1], parse.num)
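
To make that concrete, here is a sketch that rebuilds your data from
the printout above (stringsAsFactors = TRUE is my assumption, chosen to
reproduce the Factor column that str(vec) shows) and uses your
parse.num without the debugging output. The second half is only a
possible alternative, not something you asked for: gsub() and
as.numeric() are vectorized, so the column can be converted in one go,
assuming you are happy for "N/A" to become NA rather than stay a string.

```r
# Data reconstructed from the printout in the question
vec <- data.frame(
  mcap = c("200.5B", "19.1M", "223.7B", "888.0M",
           "141.7B", "273.5M", "55.649B"),
  stringsAsFactors = TRUE
)

# parse.num as posted, minus the cat()/str() debugging lines
parse.num <- function(s) {
  s <- as.character(s)
  if (s == "N/A") return(s)
  as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))
}

# vec[, 1] is the column itself, so FUN now sees one value per call
res <- lapply(vec[, 1], parse.num)
length(res)   # one result per row

# Assumed alternative: convert the whole column at once,
# turning "N/A" into NA so the result stays numeric
s <- as.character(vec$mcap)
s[s == "N/A"] <- NA
num <- as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))
```

The lapply() version returns a list, which can hold both numbers and
the "N/A" string; the vectorized version returns a plain numeric
vector.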


Peter Ehlers
