[R] Matching a pattern of vector of character strings in another vector of character strings

David Winsemius dwinsemius at comcast.net
Fri Dec 17 15:39:32 CET 2010


On Dec 17, 2010, at 8:34 AM, Jing Liu wrote:

>
> Dear all,
>
> My question is illustrated by the following example:
>
> I have a matrix M:
>
>> M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
>>             nrow=3)
>> colnames(M)<- c("2006","2007","2008","2009","2010")
>> M
>     2006 2007 2008 2009 2010
> [1,] "0"  "1"  "1"  "*"  "0"
> [2,] "0"  "0"  "0"  "1"  "1"
> [3,] "1"  "1"  "0"  "1"  "*"
>
>> pattern<- c("0","1")
>
> I would like to find, for each row, whether it contains exactly the
> pattern of two character strings, beginning with a "0" and followed
> by a "1", i.e., exactly "0" "1". If it does, in which year does it start?
> E.g. It should return 2006 for row 1, 2008 for row 2 and 2008 for  
> row 3.
>
> As far as I know, the grep family of functions cannot search for a
> pattern that consists of 2 or more character strings. I could do it
> with a loop, but I am looking for a more efficient way than a loop.
> How should I do it? I would really appreciate your help!

You can just paste() each row with collapse="_" and then use the
grep-ish functions as you were hoping to.

 > m2 <- apply(M, 1, paste, collapse="_")
 > colnames(M)[(regexpr("0_1", m2)+1)/2]  # assumes every element is exactly 1 character
[1] "2006" "2008" "2008"
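
If the elements can be longer than one character, or a row might not contain
the pattern at all, the position arithmetic above no longer holds (regexpr()
returns -1 for no match, which becomes index 0 and silently drops that row
from the result). Note also that "*" is a regex metacharacter, so a pattern
containing it would need fixed=TRUE in the pasted approach. A minimal sketch
of one alternative, which compares shifted column blocks instead of pasting,
follows. The function name first_match_col is made up for illustration and
is not from any package; it loops over the pattern elements only, not over
rows.

```r
M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
            nrow = 3)
colnames(M) <- c("2006","2007","2008","2009","2010")

# Hypothetical helper (an assumption, not from the original post):
# for each row of M, return the name of the column where `pattern`
# first starts, or NA when the row does not contain the pattern.
first_match_col <- function(M, pattern) {
  k <- length(pattern)
  n <- ncol(M)
  if (k > n) return(rep(NA_character_, nrow(M)))
  starts <- seq_len(n - k + 1)          # candidate starting columns
  # hit[i, s] ends up TRUE when row i matches the whole pattern at column s
  hit <- matrix(TRUE, nrow(M), length(starts))
  for (j in seq_len(k)) {
    hit <- hit & (M[, starts + j - 1, drop = FALSE] == pattern[j])
  }
  # index of the first TRUE in each row, NA if there is none
  idx <- apply(hit, 1, function(h) if (any(h)) unname(which(h)[1]) else NA_integer_)
  colnames(M)[idx]
}

first_match_col(M, c("0", "1"))   # "2006" "2008" "2008"
first_match_col(M, c("*", "0"))   # "2009" NA     NA
```

Because the comparison is done with ==, element width and regex special
characters do not matter here; the cost is a logical matrix of size
nrow(M) by (ncol(M) - length(pattern) + 1).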

-- 
David Winsemius, MD
West Hartford, CT


