[R] regexp problem

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jul 1 17:08:12 CEST 2011


On Fri, Jul 1, 2011 at 11:02 AM, Rainer M Krug <r.m.krug at gmail.com> wrote:
> Hi
>
> I have a question concerning regexp - I want to select with grep all
> character strings which contain the numbers 11:20 (code below).
>
> At the moment I am using [], but that obviously does not work, as it matches
> each element in the []. Is there a way to specify that the regexp should
> match 11, but not 1?
>
> Here is the code:
>
> x <- paste("suff", 1:40, "pref", sep="_")
> x
> ##  [1] "suff_1_pref"  "suff_2_pref"  "suff_3_pref"  "suff_4_pref"  "suff_5_pref"
> ##  [6] "suff_6_pref"  "suff_7_pref"  "suff_8_pref"  "suff_9_pref"  "suff_10_pref"
> ## [11] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref" "suff_15_pref"
> ## [16] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref" "suff_20_pref"
> ## [21] "suff_21_pref" "suff_22_pref" "suff_23_pref" "suff_24_pref" "suff_25_pref"
> ## [26] "suff_26_pref" "suff_27_pref" "suff_28_pref" "suff_29_pref" "suff_30_pref"
> ## [31] "suff_31_pref" "suff_32_pref" "suff_33_pref" "suff_34_pref" "suff_35_pref"
> ## [36] "suff_36_pref" "suff_37_pref" "suff_38_pref" "suff_39_pref" "suff_40_pref"
>
> i <- paste(11:20, collapse=",")
> i
> ## [1] "11,12,13,14,15,16,17,18,19,20"
>
> grep(paste("suff_[", i, "]", sep=""), x, value=TRUE)
> ##  [1] "suff_1_pref"  "suff_2_pref"  "suff_3_pref"  "suff_4_pref"  "suff_5_pref"
> ##  [6] "suff_6_pref"  "suff_7_pref"  "suff_8_pref"  "suff_9_pref"  "suff_10_pref"
> ## [11] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref" "suff_15_pref"
> ## [16] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref" "suff_20_pref"
> ## [21] "suff_21_pref" "suff_22_pref" "suff_23_pref" "suff_24_pref" "suff_25_pref"
> ## [26] "suff_26_pref" "suff_27_pref" "suff_28_pref" "suff_29_pref" "suff_30_pref"
> ## [31] "suff_31_pref" "suff_32_pref" "suff_33_pref" "suff_34_pref" "suff_35_pref"
> ## [36] "suff_36_pref" "suff_37_pref" "suff_38_pref" "suff_39_pref" "suff_40_pref"
>
> ## But I would like to have
> ## [1] "suff_11_pref" "suff_12_pref" "suff_13_pref" "suff_14_pref" "suff_15_pref"
> ## [6] "suff_16_pref" "suff_17_pref" "suff_18_pref" "suff_19_pref" "suff_20_pref"
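
A bracket expression is a set of single characters, not a list of
alternatives. The class built from i therefore only says "any one of the
digits 0-9 or a comma", which is why every element matches. A small check,
as a sketch that reuses the x and i from the question (class_chars and hits
are names made up for this illustration):

```r
x <- paste("suff", 1:40, "pref", sep = "_")
i <- paste(11:20, collapse = ",")

# Distinct characters that end up inside the [] of the pattern
class_chars <- sort(unique(strsplit(i, "")[[1]]))
class_chars

# The pattern means "suff_" followed by any ONE of those characters,
# so every string whose number starts with a digit is a hit
hits <- grep(paste("suff_[", i, "]", sep = ""), x, value = TRUE)
length(hits)
```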

Here are two approaches:

grep("1[1-9]|20", x, value = TRUE)

grep(paste(11:20, collapse = "|"), x, value = TRUE)
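
The alternation approach can be checked against the wanted result as below.
This is only a sketch: the names wanted, anchored, y and num are made up for
the illustration, and the stricter variants assume the strings always have
the form suff_<number>_pref. They would matter if the numbers ran past 40,
since an unanchored "11" also matches inside "111" or "211".

```r
x <- paste("suff", 1:40, "pref", sep = "_")
wanted <- paste("suff", 11:20, "pref", sep = "_")

# Alternation of the literal numbers: "11|12|...|20"
alt <- paste(11:20, collapse = "|")
res_alt <- grep(alt, x, value = TRUE)
identical(res_alt, wanted)

# Stricter variant (assumption: the number sits between underscores),
# so that "suff_111_pref" or "suff_120_pref" would not match
anchored <- paste("_(", alt, ")_", sep = "")
y <- paste("suff", 1:200, "pref", sep = "_")
res_anchored <- grep(anchored, y, value = TRUE)
identical(res_anchored, wanted)

# Alternative without matching the range by regexp at all:
# extract the number, then test membership in 11:20
num <- as.integer(sub("^suff_([0-9]+)_pref$", "\\1", y))
res_num <- y[num %in% 11:20]
identical(res_num, wanted)
```

Extracting the number and using %in% tends to scale better than a long
alternation when the range of wanted values is large or not contiguous.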



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


