[R] text matching and substitution

Stephan Kolassa Stephan.Kolassa at gmx.de
Sat Mar 28 18:08:01 CET 2009


Hi Simeon,

I played around a little with Vectorize and mapply, but I couldn't make 
it work :-( So, my best guess would be a simple loop like this:

result <- as.character(paste(letters,colours(),"stuff",LETTERS))
target <- c("red","blue","green","gray")
for ( new.color in target ) { result[grep(new.color,result)] <- new.color }
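
The loop above leaves unmatched entries as they are and is case-sensitive. As a rough sketch only, the same idea can be extended to ignore case and to label everything that matches no target as "other". The function name simplify_colours and its default argument are made up for this illustration. Note also that the quoted loop further down runs only once, because for (i in length(target)) iterates over the single value 4; seq_along(target) would visit every index. The return() call there is also not valid outside a function.

```r
# Sketch: label each string with the first target colour it contains,
# ignoring case; strings matching no target get the default label.
# simplify_colours and default are hypothetical names for this example.
simplify_colours <- function(x, targets, default = "other") {
  out <- rep(NA_character_, length(x))
  for (tg in targets) {
    # only fill entries that have not been labelled by an earlier target
    hit <- is.na(out) & grepl(tg, x, ignore.case = TRUE)
    out[hit] <- tg
  }
  out[is.na(out)] <- default
  out
}

x <- c("xxxxx xx xx xxxxx  red xx xxx xx",
       "xx xxx xxx xx  blue xx xx xx xx x",
       "x xx xxxxxxxx xx xx xx xxxx RED",
       "red xx xx xx xx xx",
       "xx xx xx xx xx xx")
simplify_colours(x, c("red", "blue", "green", "gray"))
```

Each pass is a single vectorised grepl() call over the whole vector, so the loop runs once per target rather than once per string, which should be acceptable for a large dataset with a handful of targets.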

Best of luck,
Stephan


simeon duckworth wrote:
> stephan
> 
> sorry for not being clear - but that's exactly what i want.
> 
> i'd like to replace every complex string that contains "red" with just
> "red", and then so on with "blue", "yellow" etc
> 
> my data is of the form
> 
> "xxxxx xx xx xxxxx  red xx xxx xx"
> "xx xxx xxx xx  blue xx xx xx xx x"
> "x xx xxxxxxxx xx xx xx xxxx red"
> "red xx xx xx xx xx"
> "xx xx xx xx xx xx"
> "xx x x x x xxxx"
> 
> which i'd like to replace with
> "red"
> "blue"
> "red"
> "red"
> "other"
> "other"
> 
> thanks
> 
> 
> On Sat, Mar 28, 2009 at 2:38 PM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
> 
>> Hi Simeon,
>>
>> I'm slightly unclear on what exactly you are trying to achieve... Are you
>> trying to replace every entry of colours which *contains* "red" by "red",
>> dropping the rest of the entry? And same with "blue"?
>>
>> A short example "before & after" would be helpful...
>>
>> Best,
>> Stephan
>>
>>
>> simeon duckworth wrote:
>>
>>> thanks stephan.  i'd been trying to make gsub work, but couldn't make it
>>> replace the whole expression.  so i'd resorted to trying to loop with grep -
>>> but with two problems.   firstly, i can't seem to make the loop 'remember'
>>> the substitutions it makes (see below).  secondly, it feels like this is a
>>> really inefficient way of doing something quite simple anyhow.
>>>
>>> colours <- as.character(paste(letters,colours(),"stuff",LETTERS))
>>> target <- c("red","blue","green","gray")
>>> new.colour <-colours
>>> for (i in length(target)) {
>>>    x <- target[i]
>>>    new.colour[grep((x),new.colour)] <- x
>>>    return(new.colour)
>>>    }
>>>
>>>
>>>
>>>
>>> On Sat, Mar 28, 2009 at 9:45 AM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
>>>
>>>> Hi Simeon,
>>>>
>>>> ?gsub
>>>>
>>>> HTH,
>>>> Stephan
>>>>
>>>> simeon duckworth wrote:
>>>>
>>>>> I am trying to simplify a text variable by matching and replacing it with
>>>>> a string in another vector
>>>>>
>>>>> so for example in
>>>>> colours <- paste(letters,colours(),"stuff",LETTERS)
>>>>>
>>>>> find and replace with ("red","blue","green","gray","yellow","other")  -
>>>>> irrespective of case
>>>>>
>>>>> it's a large dataset, so i'd like to be able to do this as efficiently as
>>>>> possible.
>>>>>
>>>>> thanks for any help
>>>>>
>



