[R] highest and second highest value in row for each combination
jim holtman
jholtman at gmail.com
Fri Feb 11 01:38:43 CET 2011
Here is another way of doing it, using melt() from reshape2 and then split/lapply:
> set.seed(19)
>
> area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
> type<-c(rep(1:10,5))
> a<-rnorm(50)
> b<-rnorm(50)
> c<-rnorm(50)
> d<-rnorm(50)
> df<-cbind(area,type,a,b,c,d)
> df1 <- data.frame(df)
> require(reshape2)
> df.melt <- melt(df1, id=c('area', 'type'))
> result <- do.call(rbind,
+     lapply(split(df.melt, list(df.melt$area, df.melt$type), drop=TRUE),
+            function(x){
+         # get at most the first two if present
+         head(x[order(x$value, decreasing=TRUE),], 2)
+     })
+ )
>
> result
        area type variable      value
1.1.51     1    1        b 1.70366970
1.1.101    1    1        c 0.79101298
2.1.161    2    1        d 1.56797593
2.1.61     2    1        b 0.79868725
3.1.21     3    1        a 1.42342348
3.1.121    3    1        c 0.44547975
4.1.131    4    1        c 1.72745545
4.1.31     4    1        a 1.50474144
5.1.141    5    1        c 1.72521942
5.1.191    5    1        d 0.52466470
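
For comparison, here is a minimal base R sketch (no reshape2) that picks the top two columns per row and also builds the "combination" column from the original question. It assumes each area/type pair occurs in exactly one row, as in the example data. The names vals, pos, rows and top2 are illustrative only and do not appear elsewhere in this thread, and pasting area and type together without a separator is an assumption based on the "11", "12" labels in the question.

```r
# Sketch only: base R, assuming one row per area/type pair.
set.seed(19)
area <- rep(1:5, each = 10)
type <- rep(1:10, times = 5)
a <- rnorm(50)
b <- rnorm(50)
c <- rnorm(50)
d <- rnorm(50)
df1 <- data.frame(area, type, a, b, c, d)

# Numeric matrix holding only the value columns.
vals <- as.matrix(df1[, c("a", "b", "c", "d")])
n <- nrow(vals)

# apply() returns a 2 x n matrix: for each row of vals, the column
# positions of the largest and second-largest value. as.vector()
# unrolls it as row 1 max, row 1 second, row 2 max, and so on.
pos <- as.vector(apply(vals, 1, function(v) order(v, decreasing = TRUE)[1:2]))
rows <- rep(seq_len(n), each = 2)

top2 <- data.frame(
  # "11", "12", ... as in the question; use paste(area, type, sep = ".")
  # instead if the labels could otherwise collide.
  combination = paste0(df1$area, df1$type)[rows],
  rank        = rep(1:2, times = n),
  colname     = colnames(vals)[pos],
  max         = vals[cbind(rows, pos)]   # matrix indexing by (row, column)
)

head(top2)
```

If an area/type pair can occur in more than one row, the split/lapply version above is the safer choice, since it ranks all values within each group rather than within each row.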
On Thu, Feb 10, 2011 at 12:55 PM, Phil Spector
<spector at stat.berkeley.edu> wrote:
> Alain -
> Here's a reproducible data set:
>
> set.seed(19)
> area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
> type<-c(rep(1:10,5))
> a<-rnorm(50)
> b<-rnorm(50)
> c<-rnorm(50)
> d<-rnorm(50)
> df<-cbind(area,type,a,b,c,d)
>
> First I'll make a helper function to operate on one row of the data frame:
>
> get2 = function(x){
>    y = x[-c(1,2)]
>    oy = order(y,decreasing=TRUE)
>    nms = colnames(df)[-c(1,2)]
>    data.frame(area=rep(x[1],2),type=rep(x[2],2),
>               max=y[oy[1:2]],colname=nms[oy[1:2]])
> }
>
> Now I can use apply, do.call and rbind to get the answer:
>
>> answer = do.call(rbind,apply(df,1,get2))
>> head(answer)
>
>    area type        max colname
> b     1    1  1.7036697       b
> c     1    1  0.7910130       c
> c1    1    2  2.4576579       c
> a     1    2  0.3885812       a
> c2    1    3  1.2363598       c
> a1    1    3 -0.3443333       a
>
> (My numbers differ from yours because you didn't specify
> a seed for the random number generator)
>
> I'm not exactly sure how to form your column "combination", though.
>
> - Phil Spector
> Statistical Computing Facility
> Department of Statistics
> UC Berkeley
> spector at stat.berkeley.edu
>
>
> On Thu, 10 Feb 2011, Alain D. wrote:
>
>> Dear R-List,
>>
>> I have a dataframe
>>
>> area<-c(rep(1,10),rep(2,10),rep(3,10),rep(4,10),rep(5,10))
>> type<-c(rep(1:10,5))
>> a<-rnorm(50)
>> b<-rnorm(50)
>> c<-rnorm(50)
>> d<-rnorm(50)
>> df<-cbind(area,type,a,b,c,d)
>>
>>
>> df
>>       area type           a            b           c           d
>>  [1,]    1    1  0.45608192  0.240378547  2.05208079 -1.18827462
>>  [2,]    1    2 -0.12119506 -0.028078577 -2.64323695 -0.83923441
>>  [3,]    1    3  0.09066133 -1.134069619  1.53344812 -0.15670239
>>  [4,]    1    4 -1.34505241  1.919941172 -1.02090099  0.75664358
>>  [5,]    1    5 -0.29279617 -0.314955019 -0.88809266  2.22282022
>>  [6,]    1    6 -0.59697893 -0.652937746  1.05132400 -0.02469151
>>  [7,]    1    7 -1.18199400  0.728165962 -1.51419348  0.65640976
>>  [8,]    1    8 -0.72925659  0.303514237  0.79758488  0.93444350
>>  [9,]    1    9 -1.60080508 -0.187562633  0.51288428 -0.55692877
>> [10,]    1   10  0.54373268 -0.494994392  0.52902381  1.12938122
>> [11,]    2    1 -1.29675664 -0.644990784 -2.44067511 -0.18489544
>> [12,]    2    2  0.86330699  1.458038882  1.17514710  1.32896878
>> [13,]    2    3  0.30069402  1.361211939  0.84757211  1.14502761
>> ...
>>
>> Now I want to have for each combination of area and type the name and
>> corresponding value of the two columns with the highest and second highest
>> value a,b,c,d.
>> In the above example it should be something like
>>
>> combination   max colname
>>          11  2.05       c
>>          11  0.46       a
>>          12 -0.03       b
>>          12 -0.12       a
>> ...
>>
>> (It might be arranged differently, though)
>>
>> Can anyone help?
>>
>> Thank you in advance!
>>
>> Alain
>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?