[R] Matching a pattern of vector of character strings in another vector of character strings

Marc Schwartz marc_schwartz at me.com
Fri Dec 17 15:10:14 CET 2010


On Dec 17, 2010, at 7:58 AM, Liviu Andronic wrote:

> On Fri, Dec 17, 2010 at 2:34 PM, Jing Liu <quiet_jing0920 at hotmail.com> wrote:
>>> M<- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),nrow=3)
>>> colnames(M)<- c("2006","2007","2008","2009","2010")
>>> M
>>     2006 2007 2008 2009 2010
>> [1,] "0"  "1"  "1"  "*"  "0"
>> [2,] "0"  "0"  "0"  "1"  "1"
>> [3,] "1"  "1"  "0"  "1"  "*"
>> 
>>> pattern<- c("0","1")
>> 
>> I would like to find, for each row, whether it contains exactly the pattern of two character strings, a "0" immediately followed by a "1", i.e., exactly "0" "1". If it does, in which year does the pattern start?
>> E.g., it should return 2006 for row 1, 2008 for row 2, and 2008 for row 3.
>> 
> I could only think of this
>> apply(M, 1, function(z) grep('01', paste(z, collapse='')))
> [1] 1 1 1
>> apply(M, 1, function(z) grepl('01', paste(z, collapse='')))
> [1] TRUE TRUE TRUE
> 
> But it doesn't return the position of the matched string. So this
> isn't what you wanted.
> 
> Regards
> Liviu
> 
> 
>> As far as I know, the grep family of functions cannot search for a pattern made up of 2 or more character strings. I could do it with a loop, but I am looking for a more efficient way. How should I do it? I would really appreciate your help!
>> 
>> Best regards,
>> Jing Liu


Try this:

> colnames(M)[regexpr("01", apply(M, 1, paste, collapse = ""))]
[1] "2006" "2008" "2008"
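
For anyone who wants to see what the one-liner is doing, here is a rough step-by-step sketch. The intermediate names (rows, pos, years) are only for illustration. It relies on every cell of M being exactly one character, so that a character position in the collapsed string is the same as a column index.

```r
# Rebuild the matrix from the original post
M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
            nrow = 3)
colnames(M) <- c("2006", "2007", "2008", "2009", "2010")

# Step 1: collapse each row into a single string,
# giving "011*0", "00011" and "1101*"
rows <- apply(M, 1, paste, collapse = "")

# Step 2: regexpr() gives the start position of the first "01" in each
# string; as.integer() drops the match.length attribute
pos <- as.integer(regexpr("01", rows))

# Step 3: use the positions as column indices to look up the year
years <- colnames(M)[pos]
years
```

This is also why grep() gave 1 for every row in the attempt above: each row had been collapsed to a single string, and grep() reports which element of the vector matched, not where inside the string the match starts.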


See ?regexpr for more info.
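
One caveat: regexpr() returns -1 for a string with no match, and a -1 used as an index would not give a sensible year. Below is a possible sketch of a more defensive version. The helper name first_match_year and the extra row in M2 are hypothetical, made up for this illustration. It also uses fixed = TRUE so that the pattern is taken literally, on the assumption that no regular expression features are wanted.

```r
# Hypothetical helper, not part of base R: returns the column name at which
# the pattern first starts in each row, or NA if the row has no match.
# Assumes every cell of M is exactly one character.
first_match_year <- function(M, pattern = c("0", "1")) {
  rows <- apply(M, 1, paste, collapse = "")
  pat  <- paste(pattern, collapse = "")          # c("0", "1") becomes "01"
  pos  <- as.integer(regexpr(pat, rows, fixed = TRUE))
  pos[pos < 0] <- NA_integer_                    # -1 means "no match"
  colnames(M)[pos]
}

M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
            nrow = 3)
colnames(M) <- c("2006", "2007", "2008", "2009", "2010")

# Made-up fourth row that contains no "0" followed by "1"
M2 <- rbind(M, c("1", "1", "1", "0", "0"))

first_match_year(M2)
```

With the made-up fourth row, the last element of the result should be NA rather than a year.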

HTH,

Marc Schwartz


