[R] How to get multiple partial matches?
jim holtman
jholtman at gmail.com
Thu Sep 7 02:01:38 CEST 2006
Try using 'grep' and regular expressions:
> x <- "72 5S_F_1 501 567
+ 7700 5S_F_2 338 611
+ 7517 5S_F_3 412 467
+ 10687 5S_F_4 380 428
+ 4870 5S_F_5 315 368
+ 6035 5S_F_6 300 359
+ 3826 5S_F_7 350 386
+ 8754 5S_F_8 450 473
+ 6399 5S_F_9 439 494
+ 749 5S_F_10 334 384
+ "
> df <- read.table(textConnection(x))
> df
V1 V2 V3 V4
1 72 5S_F_1 501 567
2 7700 5S_F_2 338 611
3 7517 5S_F_3 412 467
4 10687 5S_F_4 380 428
5 4870 5S_F_5 315 368
6 6035 5S_F_6 300 359
7 3826 5S_F_7 350 386
8 8754 5S_F_8 450 473
9 6399 5S_F_9 439 494
10 749 5S_F_10 334 384
> # select rows whose V2 contains '5S_F_1' (a partial match, so '5S_F_10' is returned too)
> df[grep('5S_F_1', as.character(df$V2)),]
V1 V2 V3 V4
1 72 5S_F_1 501 567
10 749 5S_F_10 334 384
>
>
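Because grep() does partial matching, the pattern '5S_F_1' above also returns '5S_F_10'. A minimal sketch of how to control that with regular-expression anchors follows. The data frame df2 and the probe name 'tRNA_A_1' are made up for illustration, standing in for the non-"5S_F_" rows of the real data, and grepl() is assumed to be available in your version of R.

```r
# Hypothetical stand-in data: three probe names from the example above plus
# one invented non-"5S_F_" name (its value is left as NA on purpose).
df2 <- data.frame(Probe = c("5S_F_1", "5S_F_2", "5S_F_10", "tRNA_A_1"),
                  Median = c(501, 338, 334, NA),
                  stringsAsFactors = FALSE)

# All rows whose Probe starts with "5S_F_": '^' anchors the match at the start.
all_5s <- df2[grepl("^5S_F_", df2$Probe), ]

# Only the row whose Probe is exactly "5S_F_1": '$' anchors the match at the end.
only_1 <- df2[grepl("^5S_F_1$", df2$Probe), ]

print(all_5s)
print(only_1)
```

grepl() returns a logical vector the same length as the column, so it can be combined with other conditions using & and |; grep() returns the matching row positions instead, and either one works as a row index.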
On 9/6/06, Sarah Tucker <sltucker15 at yahoo.com> wrote:
> Hi,
>
> I'm very new to R, and am not at all a software
> programmer of any sort. I appreciate any help you
> may have. I have figured out how to get my data into
> a dataframe and order it alphabetically according to a
> particular column. Now, I would like to separate out
> certain rows based on partial character matches. Here
> is an (extremely) abbreviated example of my data set:
>
> Probe Ch1 Median - B Ch1 Mean - B
> 72 5S_F_1 501 567
> 7700 5S_F_2 338 611
> 7517 5S_F_3 412 467
> 10687 5S_F_4 380 428
> 4870 5S_F_5 315 368
> 6035 5S_F_6 300 359
> 3826 5S_F_7 350 386
> 8754 5S_F_8 450 473
> 6399 5S_F_9 439 494
> 749 5S_F_10 334 384
>
> I would like to be able to select out all rows with,
> for example, "5S_F_" in the Probe column (there are
> non-"5S_F_" containing values in the real, larger data
> set).
>
> I think pmatch does this for instances where there is
> only 1 match, but I would like to recover all the
> matches. I have tried to use charmatch, match,
> pmatch, agrep and grep for this purpose, but with no
> luck.
>
> When I grep for "5S_F_" with value = T, I get
> "character(0)"
> Adding wildcards (either "*" or ".") does not change
> this outcome.
>
> I thought maybe the underscores were messing it up, so
> I tried to grep "5S*" with value = T, and I get a long
> list of numbers back
>
> [1] "55" "95" "56" "57" "58" "59" "65"
> "75" "85" "105"
> [11] "115" "125" "135" "5" "5" "5" "5"
> "5" "5" "5"
>
> These numbers make no sense to me. They don't seem to
> correlate with where the "5S"'s occur in the
> dataframe, and they don't look like any values in the
> Probe column (there are no numeric values in the Probe
> column, just strings of character digit combinations).
>
> How can I select out all the rows with the same
> partial character match?
>
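On the "5S*" result: in a regular expression, '*' means "zero or more of the preceding character", not "anything" as in a file-name wildcard. So "5S*" matches any string that contains a "5", which is consistent with the values you got back. Those values also suggest that grep was searching something other than the Probe column, but that is only a guess since the call itself was not shown. A small sketch on a made-up vector v (hypothetical values, not your data):

```r
# Hypothetical vector mixing a probe name with strings that merely contain "5".
v <- c("5S_F_1", "55", "105", "S_only", "abc")

# As a regular expression, "5S*" is "a 5 followed by zero or more S",
# so every element containing a "5" matches.
any_five <- grep("5S*", v, value = TRUE)

# fixed = TRUE treats the pattern as a literal string, not a regular expression.
literal <- grep("5S_F_", v, value = TRUE, fixed = TRUE)

# glob2rx() (utils package) converts a wildcard pattern into a regular expression.
wildcard <- grep(glob2rx("5S*"), v, value = TRUE)

print(any_five)
print(literal)
print(wildcard)
```

It is also worth running str() on the data frame to confirm that grep is being given the Probe column itself, for example as.character(df$Probe), and not the whole data frame or its row names.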
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?