[R] partial matches across rows not columns
Jannis
bt_jannis at yahoo.de
Tue Jun 8 23:39:17 CEST 2010
I did not go too deep into your zoology problem ;-) but as far as I
understood you, you want to omit all rows where
ID and TO_ID are A1 and A1.1 (or A2 ...), correct?
If the data you sent us is all the data and no other
situations occur, the following should be sufficient:
Reduce the vectors ID and TO_ID to values without the . and the number
following it (e.g. A1.1 -> A1):
ID.clean <- sub("\\..*$", "", data$ID)
TO_ID.clean <- sub("\\..*$", "", data$TO_ID)
And then use logical indexing to keep only the rows where the two differ
(rows where they match are a female and her own pup):
data.clean <- data[ID.clean != TO_ID.clean, ]
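
A minimal sketch of this approach on a toy subset of the posted data, in case it helps to see it end to end. The names seal_clean, id_base, to_id_base and mean_dist are made up for illustration, and only three of the original columns are kept. It assumes that everything before the first dot identifies the female, so two pups of the same female (say A1.1 and A1.2, if such IDs existed) would also be treated as a pair and dropped.

```r
# Toy subset of the posted data; only the columns needed here are kept.
seal_dist <- data.frame(
  ID    = c("A1", "A1", "A1.1", "A1.1", "A2", "MALE1"),
  TO_ID = c("MALE1", "A1.1", "A1", "A2", "A1.1", "A1"),
  DIST  = c(4.81803, 7.57332, 7.57332, 6.47847, 6.47847, 4.81803),
  stringsAsFactors = FALSE
)

# Strip the first dot and everything after it: "A1.1" -> "A1".
# IDs without a dot (e.g. "MALE1") are left unchanged.
id_base    <- sub("\\..*$", "", seal_dist$ID)
to_id_base <- sub("\\..*$", "", seal_dist$TO_ID)

# Drop rows where both animals share a base ID (a female and her own pup).
seal_clean <- seal_dist[id_base != to_id_base, ]

# Mean distance per focal animal, e.g. as input to the nearest-neighbour summary.
mean_dist <- tapply(seal_clean$DIST, seal_clean$ID, mean)
```

Here the two A1 / A1.1 rows are removed and the other four rows are kept.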
HTH
Jannis
RCulloch wrote:
> Hi R users,
>
> I am trying to omit rows of data based on partial matches an example of my
> data (seal_dist) is below:
>
> A quick break down of my coding and why I need to answer this - I am dealing
> with a colony of seals where for example A1 is a female with pup and A1.1 is
> that female's pup, the important part of the data here is DIST which tells
> the distance between one seal (ID) and another (TO_ID). What I want to do is
> take a mean for these data for a nearest neighbour analysis but I want to
> omit any cases where there is the distance between a female and her pup,
> i.e. in the previous e.g. omit rows where A1 and A1.1 occur.
>
> I have looked at grep and pmatch but these appear to work across columns and
> don't appear to do what I'm looking to do,
>
> If anyone can point me in the right direction, I'd be most grateful,
>
> Best wishes,
>
> Ross
>
>
> FROM TO DIST ID HR DD MM YY ANIMAL DAY TO_ID TO_ANIMAL
> 2 1 2 4.81803 A1 1 30 9 9 1 1 MALE1 12
> 3 1 3 2.53468 A1 1 30 9 9 1 1 A2 3
> 4 1 4 7.57332 A1 1 30 9 9 1 1 A1.1 7
> 5 1 1 7.57332 A1.1 1 30 9 9 7 1 A1 1
> 6 1 2 7.89665 A1.1 1 30 9 9 7 1 MALE1 12
> 7 1 3 6.47847 A1.1 1 30 9 9 7 1 A2 3
> 9 1 1 2.53468 A2 1 30 9 9 3 1 A1 1
> 10 1 2 2.59051 A2 1 30 9 9 3 1 MALE1 12
> 12 1 4 6.47847 A2 1 30 9 9 3 1 A1.1 7
> 13 1 1 4.81803 MALE1 1 30 9 9 12 1 A1 1
> 15 1 3 2.59051 MALE1 1 30 9 9 12 1 A2 3
> 16 1 4 7.89665 MALE1 1 30 9 9 12 1 A1.1 7
> 17 1 1 3.85359 A1 2 30 9 9 1 1 MALE1 12
> 19 1 3 4.88826 A1 2 30 9 9 1 1 A2 3
> 20 1 4 7.25773 A1 2 30 9 9 1 1 A1.1 7
> 21 1 1 9.96431 A1.1 2 30 9 9 7 1 MALE1 12
> 22 1 2 7.25773 A1.1 2 30 9 9 7 1 A1 1
> 23 1 3 5.71725 A1.1 2 30 9 9 7 1 A2 3
> 25 1 1 8.73759 A2 2 30 9 9 3 1 MALE1 12
> 26 1 2 4.88826 A2 2 30 9 9 3 1 A1 1
> 28 1 4 5.71725 A2 2 30 9 9 3 1 A1.1 7
> 30 1 2 3.85359 MALE1 2 30 9 9 12 1 A1 1
> 31 1 3 8.73759 MALE1 2 30 9 9 12 1 A2 3
> 32 1 4 9.96431 MALE1 2 30 9 9 12 1 A1.1 7
> 33 1 1 7.95399 A1 3 30 9 9 1 1 MALE1 12
> 35 1 3 0.60443 A1 3 30 9 9 1 1 A1.1 7
> 36 1 4 1.91136 A1 3 30 9 9 1 1 A2 3
> 37 1 1 8.29967 A1.1 3 30 9 9 7 1 MALE1 12
> 38 1 2 0.60443 A1.1 3 30 9 9 7 1 A1 1
> 40 1 4 1.43201 A1.1 3 30 9 9 7 1 A2 3
> 41 1 1 9.71659 A2 3 30 9 9 3 1 MALE1 12
> 42 1 2 1.91136 A2 3 30 9 9 3 1 A1 1
> 43 1 3 1.43201 A2 3 30 9 9 3 1 A1.1 7
> 46 1 2 7.95399 MALE1 3 30 9 9 12 1 A1 1
> 47 1 3 8.29967 MALE1 3 30 9 9 12 1 A1.1 7
> 48 1 4 9.71659 MALE1 3 30 9 9 12 1 A2 3
>