[R] Select top three values from data frame

Ottorino-Luca Pantani ottorino-luca.pantani at unifi.it
Wed Aug 26 12:33:22 CEST 2009


Noah Silverman wrote:
>
> I only have a few values in my example, but the real data set might 
> have 20-100 rows with A="X".  So how do I pick just the three highest 
> ones?
>
> -N
>
>
> On 8/26/09 2:46 AM, Ottorino-Luca Pantani wrote:
>> df.mydata[df.mydata$A=="X" AND df.mydata$C < 2, ]
>> will do the job ?
>>
>> 8rino
>>
>> Noah Silverman wrote:
>>> Hi,
>>>
>>> I'm trying to find an easy way to do this.
>>>
>>> I want to select the top three values of a specific column in a 
>>> subset of rows in a data.frame.  I'll demonstrate.
>>>
>>> A    B    C
>>> x    2    1
>>> x    4    1
>>> x    3    2
>>> y    1    5
>>> y    2    6
>>> y    3    8
>>>
>>>
>>> I want the top 3 values of B from the data.frame where A=X and C <2
>>>
>>> I could extract all the rows where C<2, then sort by B, then take 
>>> the first 3.  But that seems like the wrong way around, and it also 
>>> will get messy with real data of over 100 columns.
>>>
>>> Any suggestions?
>>>
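# expand.grid() yields 200 rows (2 levels of A x 100 values of B),
# so draw 200 values for C rather than relying on recycling.
my.data <- cbind.data.frame(expand.grid(A = c("X", "Y"), B = 1:100),
                            C = rnorm(200))
myA.data <- my.data[my.data$A == "X", ]
# Sort by C, largest first, and keep the first three rows.
myA.sorted.data <- myA.data[order(myA.data$C, decreasing = TRUE), ][1:3, ]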

Does this solve your problem?
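
If you also need the C < 2 condition and want the ranking done on B, as in
your first message, something along these lines might work. It is only a
sketch on the six-row example table; the names df, kept and top3 are made up
for illustration, and I have assumed that A is stored in lower case as in
that table.

```r
# Toy data copied from the example table in the original question.
df <- data.frame(
  A = c("x", "x", "x", "y", "y", "y"),
  B = c(2, 4, 3, 1, 2, 3),
  C = c(1, 1, 2, 5, 6, 8)
)

# Keep only the rows that satisfy both conditions.
# `&` is R's element-wise logical AND.
kept <- df[df$A == "x" & df$C < 2, ]

# Sort by B, largest first, then keep at most three rows.
# head() also copes with subsets that have fewer than three rows.
top3 <- head(kept[order(kept$B, decreasing = TRUE), ], 3)
print(top3)
```

All columns are carried along by the row indexing, so this should not get
any messier with 100 columns than with three.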

-- 
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 
OLPantani at unifi.it  http://www4.unifi.it/dssnp/



