[R] Matching a pattern of vector of character strings in another vector of character strings
David Winsemius
dwinsemius at comcast.net
Fri Dec 17 15:39:32 CET 2010
On Dec 17, 2010, at 8:34 AM, Jing Liu wrote:
>
> Dear all,
>
> My question is illustrated by the following example:
>
> I have a matrix M:
>
>> M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
>>             nrow=3)
>> colnames(M)<- c("2006","2007","2008","2009","2010")
>> M
>      2006 2007 2008 2009 2010
> [1,] "0"  "1"  "1"  "*"  "0"
> [2,] "0"  "0"  "0"  "1"  "1"
> [3,] "1"  "1"  "0"  "1"  "*"
>
>> pattern<- c("0","1")
>
> I would like to find, for each row, whether it contains exactly the
> pattern of two character strings, a "0" followed immediately by a
> "1", i.e., exactly "0" "1". If it does, in which year does it start?
> E.g., it should return 2006 for row 1, 2008 for row 2 and 2008 for
> row 3.
>
> As far as I know, the grep family of functions cannot search for a
> pattern that spans 2 or more character strings. I could do it with a
> loop, but I am looking for a more efficient way than a loop.
> How should I do it? I would really appreciate your help!
You can just paste() each row with collapse="_" and then use the
grep-ish functions as you were hoping to.
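> # one string per row, e.g. "0_1_1_*_0" for row 1
> m2 <- apply(M, 1, paste, collapse="_")
> # a match starting at character p begins in column (p+1)/2;
> # this assumes every element is exactly one character wide
> colnames(M)[(regexpr("0_1", m2)+1)/2]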
[1] "2006" "2008" "2008"
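Two caveats with the regexpr() approach above: it relies on every element being one character wide, and for a row with no match regexpr() returns -1, so the index becomes 0 and that row is silently dropped from the result. If either matters, comparing adjacent columns directly is one possible alternative. This is only a sketch; the names hit and first_hit are made up for illustration.

```r
M <- matrix(c("0","0","1","1","0","1","1","0","0","*","1","1","0","1","*"),
            nrow = 3)
colnames(M) <- c("2006","2007","2008","2009","2010")

# TRUE where column j holds "0" and column j+1 holds "1"
hit <- M[, -ncol(M), drop = FALSE] == "0" & M[, -1, drop = FALSE] == "1"

# index of the first matching column in each row; NA if the row has no match
first_hit <- apply(hit, 1, function(r) if (any(r)) which(r)[1] else NA_integer_)

# year in which the "0" "1" pattern starts, one entry per row
colnames(M)[first_hit]
```

For the example matrix this should give the same three years as above, and a row without the pattern would show up as NA instead of disappearing.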
--
David Winsemius, MD
West Hartford, CT