[R] string parsing

David Winsemius dwinsemius at comcast.net
Wed Feb 16 21:10:11 CET 2011


On Feb 16, 2011, at 2:26 PM, David Winsemius wrote:

>
> On Feb 16, 2011, at 2:09 PM, Sam Steingold wrote:
>
>>> * David Winsemius <qjvafrzvhf at pbzpnfg.arg> [2011-02-16 13:33:32 -0500]:
>>>
>>>> parse.num <- function (s) {
>>>> as.numeric(gsub("M$","e6",gsub("B$","e9",s))); }
>>>
>>> data[1] <- parse.num( data[[1]] )  # as.numeric and gsub are vectorized
>>
>> because parse.num turned out to not be as simple as that: I need to
>> handle "N/A" specially.
>>
>> parse.num1 <- function (s) {
>> if (length(s) != 1) stop("parse.num",s);
>> s <- as.character(s);
>> if (s == "N/A") return(NA);
>
> Ouch! That can be simplified to:

I meant to say (and at one time did):

is.na(s) <- (s == "N/A")
>
>   #  and then all done inside one function.

But then I failed to notice that my drag-and-drop had moved it into the
version below (usually I get tripped up the other way, when the Mac
Mail.app duplicates rather than moves).
>
>> as.numeric(gsub("M$","e6",gsub("B$","e9",s)));
>> }
>>
>> parse.num <- function (v) {
>> for (i in 1:length(v)) v[[i]] <- parse.num1(v[[i]])
>> v;
>> }
>>
>> actually... wait a sec...
>> shouldn't this work?
>>
>> bad <- (data[1] == "N/A")
>> data[1][bad] <- NA
>> data[1][!bad] <- as.numeric(gsub("M$","e6",gsub("B$","e9",data[1])))
>
> It might not, since at this point you would have vectors of different
> lengths.
>
> (I'm also not sure that the  listobj[<n>][<logical>] <- <vector>  
> construction would work, but I really don't know about that one.)
>
> parse.num2 <- function (s) {
> is.na(s) <- (s == "N/A")
> as.numeric(gsub("M$","e6",gsub("B$","e9",s)))
>      }
>
> Then call as I suggested
>
>
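
For what it's worth, here is a minimal sketch of how the pieces above fit
together. The sample data frame and its column name "cap" are made up for
illustration, and the as.character() call is an extra precaution (for factor
columns) that is not in the version quoted above:

```r
# Hypothetical sample data; the column name "cap" is invented for this sketch.
data <- data.frame(cap = c("1.5M", "2B", "N/A", "300"),
                   stringsAsFactors = FALSE)

parse.num2 <- function(s) {
  s <- as.character(s)        # assumption: guard against factor input
  is.na(s) <- (s == "N/A")    # mark the "N/A" entries as missing
  # Rewrite a trailing B or M as an exponent, then convert the whole vector.
  as.numeric(gsub("M$", "e6", gsub("B$", "e9", s)))
}

# gsub and as.numeric are vectorized, so no loop over elements is needed.
data[1] <- parse.num2(data[[1]])
```

Because the "N/A" entries are set to NA before the conversion, as.numeric
should not need to warn about them, and the result has the same length as the
input column, so the assignment back into the data frame lines up.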

David Winsemius, MD
West Hartford, CT


