[R] text matching and substitution
Stephan Kolassa
Stephan.Kolassa at gmx.de
Sat Mar 28 18:08:01 CET 2009
Hi Simeon,
I played around a little with Vectorize and mapply, but I couldn't make
it work :-( So, my best guess would be a simple loop like this:
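result <- paste(letters, colours(), "stuff", LETTERS)  # paste() already returns character
target <- c("red", "blue", "green", "gray")
# Loop over the targets themselves; for (i in length(target)) visits only the last index.
for (new.color in target) {
  result[grep(new.color, result)] <- new.color  # keep only the matched colour
}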
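If the loop turns out to be too slow on a large dataset, a single regular expression might do the job in one pass. The following is only a sketch: the function name simplify_colours, its default argument and the sample vector x are made up for illustration, and it assumes that matching should ignore case and that anything without a match becomes "other", as described in the quoted messages.

```r
# Sketch: map each string to the first target colour it contains, else "other".
# simplify_colours, default and x are illustrative names, not from the thread.
simplify_colours <- function(x, target, default = "other") {
  pattern <- paste(target, collapse = "|")   # e.g. "red|blue|green|gray"
  lower <- tolower(x)                        # lower-case copy, so case is ignored
  hit <- regexpr(pattern, lower)             # start of first match, -1 if none
  found <- !is.na(hit) & hit > 0             # TRUE where some target was found
  out <- rep(default, length(x))             # start with everything as "other"
  out[found] <- regmatches(lower, hit)       # matched text, matching elements only
  out
}

x <- c("xxxxx xx xx xxxxx red xx xxx xx",
       "xx xxx xxx xx Blue xx xx xx xx x",
       "x xx xxxxxxxx xx xx xx xxxx RED",
       "red xx xx xx xx xx",
       "xx xx xx xx xx xx",
       "xx x x x x xxxx")
simplify_colours(x, c("red", "blue", "green", "gray", "yellow"))
# [1] "red"   "blue"  "red"   "red"   "other" "other"
```

Two caveats: this matches substrings, so "bored" would count as "red" unless the pattern is wrapped in word boundaries (\\b), and when a string contains several targets the leftmost one wins, whereas in the loop the target that comes last in the vector wins.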
Best of luck,
Stephan
simeon duckworth wrote:
> stephan
>
> sorry for not being clear - but that's exactly what i want.
>
> i'd like to replace every complex string that contains "red" with just
> "red", and so on with "blue", "yellow" etc
>
> my data is of the form
>
> "xxxxx xx xx xxxxx red xx xxx xx"
> "xx xxx xxx xx blue xx xx xx xx x"
> "x xx xxxxxxxx xx xx xx xxxx red"
> "red xx xx xx xx xx"
> "xx xx xx xx xx xx"
> "xx x x x x xxxx"
>
> which i'd like to replace with
> "red"
> "blue"
> "red"
> "red"
> "other"
> "other"
>
> thanks
>
>
> On Sat, Mar 28, 2009 at 2:38 PM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
>
>> Hi Simeon,
>>
>> I'm slightly unclear on what exactly you are trying to achieve... Are you
>> trying to replace every entry of colours which *contains* "red" by "red",
>> dropping the rest of the entry? And same with "blue"?
>>
>> A short example "before & after" would be helpful...
>>
>> Best,
>> Stephan
>>
>>
>> simeon duckworth wrote:
>>
>>> thanks stephan. i'd been trying to make gsub work, but couldn't make it
>>> replace the whole expression. so i'd resorted to trying to loop with grep,
>>> but with two problems. firstly, i can't seem to make the loop 'remember'
>>> the substitutions it makes (see below). secondly, it feels like this is a
>>> really inefficient way of doing something quite simple anyhow.
>>>
>>> colours <- as.character(paste(letters,colours(),"stuff",LETTERS))
>>> target <- c("red","blue","green","gray")
>>> new.colour <-colours
>>> for (i in length(target)) {
>>> x <- target[i]
>>> new.colour[grep((x),new.colour)] <- x
>>> return(new.colour)
>>> }
>>>
>>>
>>>
>>>
>>> On Sat, Mar 28, 2009 at 9:45 AM, Stephan Kolassa <Stephan.Kolassa at gmx.de> wrote:
>>>
>>>> Hi Simeon,
>>>>
>>>> ?gsub
>>>>
>>>> HTH,
>>>> Stephan
>>>>
>>>> simeon duckworth wrote:
>>>>
>>>>> I am trying to simplify a text variable by matching and replacing it with
>>>>> a string in another vector
>>>>>
>>>>> so for example in
>>>>> colours <- paste(letters,colours(),"stuff",LETTERS)
>>>>>
>>>>> find and replace with ("red","blue","green","gray","yellow","other") -
>>>>> irrespective of case
>>>>>
>>>>> it's a large dataset, so i'd like to be able to do this as efficiently as
>>>>> possible.
>>>>>
>>>>> thanks for any help