[R] using match to obtain non-sorted index values from non-sorted vector
Folkes, Michael
Michael.Folkes at dfo-mpo.gc.ca
Wed Jul 9 22:13:20 CEST 2014
So nice!
Apply wins again.
Thanks David.
Michael
-----Original Message-----
From: David L Carlson [mailto:dcarlson at tamu.edu]
Sent: July-09-14 1:11 PM
To: Folkes, Michael; r-help at r-project.org
Subject: RE: using match to obtain non-sorted index values from
non-sorted vector
There may be a faster way, but
> sapply(Tset, function(x) which(pop.df$pop==x))
[1] 5 4 2
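
A possible alternative, offered as a sketch rather than a benchmarked
answer: calling match() with the subset as its first argument and the
population as the table should return the positions in subset order
directly. The names idx_which and idx_match are made up for this
illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# sapply/which approach from the reply above: one lookup per Tset element
idx_which <- sapply(Tset, function(x) which(pop.df$pop == x))

# match(x, table): position in 'table' of the first match for each 'x',
# returned in the order of 'x' (here, the order of Tset)
idx_match <- match(Tset, pop.df$pop)

idx_which
idx_match
```

One difference to keep in mind: match() reports only the first position
of each value, so the two approaches diverge if pop contains duplicates.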
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Folkes, Michael
Sent: Wednesday, July 9, 2014 2:58 PM
To: r-help at r-project.org
Subject: [R] using match to obtain non-sorted index values from
non-sorted vector
Hello all,
I've been struggling with the best way to find the index values of the
elements of a large vector that match the elements of a subset vector
[the table argument in match()].
BUT the index values can't come out sorted (as we'd get from
which(X %in% Y)); they need to follow the order of the subset.
My 'population' vector can't be sorted.
pop.df <- data.frame(pop=c(1,6,4,3,10))
The subset: Tset = c(10,3,6)
So I'd like to get these index values (from pop.df), in this order:
5,4,2
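
To make the ordering problem concrete, here is a small sketch using the
same data. The name idx_in is made up for this illustration.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
Tset <- c(10, 3, 6)

# %in% yields one TRUE/FALSE per element of pop, so which() can only
# report positions in ascending order of pop, not in the order of Tset
idx_in <- which(pop.df$pop %in% Tset)
idx_in   # 2 4 5, not the wanted 5 4 2
```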
If it could be sorted I could use:
which(sort(pop.df$pop) %in% sort(Tset))
But sorting will cause more grief later, so best not mess with it.
Here is my hopefully adequate MWE of a solution. I'm keen to see if
anybody has a better suggestion.
Thanks!
_____________________
###BEGIN R
#pop is the full set of values, it has no info on their ranking
# I don't want to sort these data. They need to remain in this order.
pop.df <- data.frame(pop=c(1,6,4,3,10))
#rank.df is my dataframe that tells me the top three rankings (derived elsewhere)
rank.df <- data.frame(rank=1:3, Tset = c(10,3,6)) # Target set
#match.df will be my source of row index based on rank
match.df <- data.frame(match.vec= match(pop.df$pop, table=rank.df$Tset),
index.vec=1:nrow(pop.df))
#rank.df will now include the index location in the pop.df where I can find the top three ranks.
rank.df <- merge(rank.df, match.df, by.x='rank', by.y='match.vec')
rank.df
####END
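
A shorter variant of the example above, sketched on the assumption that
every Tset value occurs exactly once in pop: swapping the arguments of
match() looks each ranked value up in pop, so the index column can be
added without the intermediate data frame or the merge(). The name
rank2.df is made up so that rank.df is left untouched.

```r
pop.df <- data.frame(pop = c(1, 6, 4, 3, 10))
rank2.df <- data.frame(rank = 1:3, Tset = c(10, 3, 6))  # Target set

# For each ranked value, find its row position in pop.df
rank2.df$index.vec <- match(rank2.df$Tset, pop.df$pop)
rank2.df
```

This should give the same index column as the merge() version. A value
in Tset that is missing from pop would show up as NA here rather than
being dropped, as merge() would do by default.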
_______________________________________________________
Michael Folkes
Salmon Stock Assessment
Canadian Dept. of Fisheries & Oceans
Pacific Biological Station
3190 Hammond Bay Rd.
Nanaimo, B.C., Canada
V9T-6N7
Ph (250) 756-7264 Fax (250) 756-7053 Michael.Folkes at dfo-mpo.gc.ca