[R] Matching a pattern of vector of character strings in another vector of character strings
Marc Schwartz
marc_schwartz at me.com
Fri Dec 17 15:10:14 CET 2010
On Dec 17, 2010, at 7:58 AM, Liviu Andronic wrote:
> On Fri, Dec 17, 2010 at 2:34 PM, Jing Liu <quiet_jing0920 at hotmail.com> wrote:
>>> M<- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),nrow=3)
>>> colnames(M)<- c("2006","2007","2008","2009","2010")
>>> M
>>      2006 2007 2008 2009 2010
>> [1,] "0"  "1"  "1"  "*"  "0"
>> [2,] "0"  "0"  "0"  "1"  "1"
>> [3,] "1"  "1"  "0"  "1"  "*"
>>
>>> pattern<- c("0","1")
>>
>> I would like to find, for each row, whether it contains exactly the pattern of two character strings, a "0" immediately followed by a "1", i.e., exactly "0" "1". If it does, in which year does the pattern start?
>> E.g., it should return 2006 for row 1, 2008 for row 2 and 2008 for row 3.
>>
> I could only think of this
>> apply(M, 1, function(z) grep('01', paste(z, collapse='')))
> [1] 1 1 1
>> apply(M, 1, function(z) grepl('01', paste(z, collapse='')))
> [1] TRUE TRUE TRUE
>
> But it doesn't return the position of the matched string. So this
> isn't what you wanted.
>
> Regards
> Liviu
>
>
>> As far as I know, the grep family of functions cannot search for a pattern made up of 2 or more character strings. I could do it with a loop, but I am looking for a more efficient way. How should I do it? I would really appreciate your help!
>>
>> Best regards,
>> Jing Liu
Try this:
> colnames(M)[regexpr("01", apply(M, 1, paste, collapse = ""))]
[1] "2006" "2008" "2008"
See ?regexpr for more info.
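
A note on the one-liner above: regexpr() returns -1 for a string with no match, and indexing colnames(M) with a -1 mixed in either fails or silently drops a column. It also relies on every cell being a single character, so that a position in the pasted string equals a column index. Below is a minimal sketch of two ways around this. The helper names first_match_col() and first_pair_col() are made up for illustration and are not from any package; treat this as one possible approach under those assumptions, not the definitive answer.

```r
# Same data as in the original question.
M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
            nrow = 3)
colnames(M) <- c("2006", "2007", "2008", "2009", "2010")

# Hypothetical helper: column name where the first "01" starts in each row,
# or NA if the row has no match. Assumes one character per cell.
first_match_col <- function(m, pattern = "01") {
  rows <- apply(m, 1, paste, collapse = "")          # e.g. "011*0"
  pos <- as.integer(regexpr(pattern, rows, fixed = TRUE))
  pos[pos < 0] <- NA                                 # -1 means "no match"
  colnames(m)[pos]
}

# Hypothetical alternative without paste(): compare neighbouring columns
# directly, so cells may hold more than one character.
first_pair_col <- function(m, pattern = c("0", "1")) {
  n <- ncol(m)
  hit <- m[, -n, drop = FALSE] == pattern[1] &
         m[, -1, drop = FALSE] == pattern[2]
  idx <- apply(hit, 1, function(h) which(h)[1])      # NA when no TRUE in row
  colnames(m)[idx]
}

first_match_col(M)   # "2006" "2008" "2008"
first_pair_col(M)    # "2006" "2008" "2008"
```

Both return NA for a row that does not contain the pattern, e.g. first_match_col(rbind(M, rep("1", 5))) has NA as its fourth element.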
HTH,
Marc Schwartz