[R] partial matches across rows not columns
jim holtman
jholtman at gmail.com
Tue Jun 8 23:15:13 CEST 2010
Is this what you are looking for:
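> # x is assumed to be the data frame from the original post (seal_dist)
> # assume female IDs start with "A" followed by digits
> # extract the female part of ID (e.g. "A1.1" -> "A1"); other IDs are unchanged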
> x.id <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
> # now see if this pattern matches first part of TO_ID
> x.match <- x.id == substring(x$TO_ID, 1, nchar(x.id))
> # here are the ones that would be eliminated
> x[x.match,]
   FROM TO    DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
4     1  4 7.57332    A1  1 30  9  9      1   1  A1.1         7
5     1  1 7.57332  A1.1  1 30  9  9      7   1    A1         1
20    1  4 7.25773    A1  2 30  9  9      1   1  A1.1         7
22    1  2 7.25773  A1.1  2 30  9  9      7   1    A1         1
35    1  3 0.60443    A1  3 30  9  9      1   1  A1.1         7
38    1  2 0.60443  A1.1  3 30  9  9      7   1    A1         1
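
To go from the flagged rows to the mean the original question asks for, the logical vector can simply be negated. The following is a minimal sketch, not part of the original reply: it rebuilds a small, hypothetical subset of the posted data (only the DIST, ID and TO_ID columns) so that it runs on its own, and the names x.keep, mean.dist and mean.by.id are made up for illustration.

```r
# Hypothetical subset of the posted data; only the columns used here are kept.
x <- data.frame(
  DIST  = c(4.81803, 2.53468, 7.57332, 7.57332, 7.89665, 2.59051),
  ID    = c("A1", "A1", "A1", "A1.1", "A1.1", "MALE1"),
  TO_ID = c("MALE1", "A2", "A1.1", "A1", "MALE1", "A2"),
  stringsAsFactors = FALSE
)

# Same matching steps as in the reply: reduce ID to its female part,
# then compare it with the leading characters of TO_ID.
x.id    <- sub("(A[[:digit:]]+).*", "\\1", x$ID)
x.match <- x.id == substring(x$TO_ID, 1, nchar(x.id))

# Keep only the rows that were NOT flagged as female/pup pairs.
x.keep <- x[!x.match, ]

# Mean distance over the remaining rows, overall and per focal animal (ID).
mean.dist  <- mean(x.keep$DIST)
mean.by.id <- tapply(x.keep$DIST, x.keep$ID, mean)
```

With the full data frame, the first block would be dropped and x would be the real data; whether the mean should be taken overall, per ID, or per hour (HR) depends on the analysis and is not settled by the thread.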
>
>
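
One caveat with a prefix comparison: if the colony had a female with an ID such as A10 (hypothetical, no such ID appears in the posted data), a row pairing A1 with A10 would also be flagged, because "A1" is the leading part of "A10". A possible alternative, sketched here under the assumption that pup IDs are always the mother's ID plus a ".<digits>" suffix, is to strip that suffix from both columns and test for exact equality. The helper base.id and the toy data frame ids are made up for this example.

```r
# Strip a trailing ".<digits>" pup suffix, so "A1.1" -> "A1";
# IDs without such a suffix are returned unchanged.
base.id <- function(id) sub("\\.[[:digit:]]+$", "", as.character(id))

# Toy IDs, including a hypothetical female "A10" to show the difference.
ids <- data.frame(
  ID    = c("A1",   "A1.1", "A1",  "A10",  "MALE1"),
  TO_ID = c("A1.1", "A1",   "A10", "A1.1", "A1"),
  stringsAsFactors = FALSE
)

# Exact comparison of base IDs: TRUE only when both animals share a mother ID.
same.family <- base.id(ids$ID) == base.id(ids$TO_ID)

# Prefix comparison as in the reply, for contrast: row 3 (A1 vs A10) is flagged too.
x.id         <- sub("(A[[:digit:]]+).*", "\\1", ids$ID)
prefix.match <- x.id == substring(ids$TO_ID, 1, nchar(x.id))
```

Note that this variant also flags two pups of the same female (for example A1.1 and A1.2) as a family pair; whether those rows should be dropped is a decision for the analysis.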
On Tue, Jun 8, 2010 at 1:43 PM, RCulloch <ross.culloch at dur.ac.uk> wrote:
>
> Hi R users,
>
> I am trying to omit rows of data based on partial matches. An example of my
> data (seal_dist) is below.
>
> A quick breakdown of my coding and why I need to answer this: I am dealing
> with a colony of seals where, for example, A1 is a female with a pup and A1.1
> is that female's pup. The important part of the data here is DIST, which gives
> the distance between one seal (ID) and another (TO_ID). What I want to do is
> take a mean of these data for a nearest neighbour analysis, but I want to
> omit any cases where the distance is between a female and her pup,
> i.e. in the example above, omit rows where A1 and A1.1 occur together.
>
> I have looked at grep and pmatch, but these appear to work across columns and
> don't appear to do what I'm looking to do.
>
> If anyone can point me in the right direction, I'd be most grateful.
>
> Best wishes,
>
> Ross
>
>
>    FROM TO    DIST    ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
> 2     1  2 4.81803    A1  1 30  9  9      1   1 MALE1        12
> 3     1  3 2.53468    A1  1 30  9  9      1   1    A2         3
> 4     1  4 7.57332    A1  1 30  9  9      1   1  A1.1         7
> 5     1  1 7.57332  A1.1  1 30  9  9      7   1    A1         1
> 6     1  2 7.89665  A1.1  1 30  9  9      7   1 MALE1        12
> 7     1  3 6.47847  A1.1  1 30  9  9      7   1    A2         3
> 9     1  1 2.53468    A2  1 30  9  9      3   1    A1         1
> 10    1  2 2.59051    A2  1 30  9  9      3   1 MALE1        12
> 12    1  4 6.47847    A2  1 30  9  9      3   1  A1.1         7
> 13    1  1 4.81803 MALE1  1 30  9  9     12   1    A1         1
> 15    1  3 2.59051 MALE1  1 30  9  9     12   1    A2         3
> 16    1  4 7.89665 MALE1  1 30  9  9     12   1  A1.1         7
> 17    1  1 3.85359    A1  2 30  9  9      1   1 MALE1        12
> 19    1  3 4.88826    A1  2 30  9  9      1   1    A2         3
> 20    1  4 7.25773    A1  2 30  9  9      1   1  A1.1         7
> 21    1  1 9.96431  A1.1  2 30  9  9      7   1 MALE1        12
> 22    1  2 7.25773  A1.1  2 30  9  9      7   1    A1         1
> 23    1  3 5.71725  A1.1  2 30  9  9      7   1    A2         3
> 25    1  1 8.73759    A2  2 30  9  9      3   1 MALE1        12
> 26    1  2 4.88826    A2  2 30  9  9      3   1    A1         1
> 28    1  4 5.71725    A2  2 30  9  9      3   1  A1.1         7
> 30    1  2 3.85359 MALE1  2 30  9  9     12   1    A1         1
> 31    1  3 8.73759 MALE1  2 30  9  9     12   1    A2         3
> 32    1  4 9.96431 MALE1  2 30  9  9     12   1  A1.1         7
> 33    1  1 7.95399    A1  3 30  9  9      1   1 MALE1        12
> 35    1  3 0.60443    A1  3 30  9  9      1   1  A1.1         7
> 36    1  4 1.91136    A1  3 30  9  9      1   1    A2         3
> 37    1  1 8.29967  A1.1  3 30  9  9      7   1 MALE1        12
> 38    1  2 0.60443  A1.1  3 30  9  9      7   1    A1         1
> 40    1  4 1.43201  A1.1  3 30  9  9      7   1    A2         3
> 41    1  1 9.71659    A2  3 30  9  9      3   1 MALE1        12
> 42    1  2 1.91136    A2  3 30  9  9      3   1    A1         1
> 43    1  3 1.43201    A2  3 30  9  9      3   1  A1.1         7
> 46    1  2 7.95399 MALE1  3 30  9  9     12   1    A1         1
> 47    1  3 8.29967 MALE1  3 30  9  9     12   1  A1.1         7
> 48    1  4 9.71659 MALE1  3 30  9  9     12   1    A2         3
> --
> View this message in context: http://r.789695.n4.nabble.com/partial-matches-across-rows-not-columns-tp2247757p2247757.html
> Sent from the R help mailing list archive at Nabble.com.
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?