[R] row selection based on median in data frame
Nick.Ellis@csiro.au
Fri Apr 2 02:57:11 CEST 2004
> tmp
  row.labels        a b  c
1          1 deadlift 7 13
2          2    squat 7 24
3          3    clean 7 10
4          4 deadlift 8  8
5          5    squat 8 20
6          6    clean 8  2
7          7 deadlift 9  5
8          8    squat 9 32
9          9    clean 9 19
> tapply(tmp$c,tmp$a,median)
   clean deadlift    squat
      10        8       24
> tmp[tapply(1:nrow(tmp),tmp$a,function(i,x) {x <- x[i]; i[x==median(x)]}, x=tmp$c),]
  row.labels        a b  c
3          3    clean 7 10
4          4 deadlift 8  8
2          2    squat 7 24
If you have multiple grouping variables gp1, gp2, gp3, you simply include those in the 2nd argument:
tmp[tapply(1:nrow(tmp),tmp[c("gp1","gp2","gp3")],function(i,x) {x <- x[i]; i[x==median(x)]}, x=tmp$c),]
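
One caveat with several grouping variables: tapply builds a cell for every combination of levels, and combinations that never occur come back as NA, which would then show up as all-NA rows in the subset. A hedged sketch of one way around that is to collapse the grouping columns into a single factor with interaction(..., drop = TRUE) first. The data frame tmp2 and its columns gp1 and gp2 are made up for illustration:

```r
# Hypothetical data: gp1 and gp2 are invented grouping columns.
tmp2 <- data.frame(
  gp1 = c("x", "x", "x", "y", "y", "y"),
  gp2 = c("p", "p", "p", "q", "q", "q"),
  c   = c(13, 8, 5, 24, 20, 32)
)

# Collapse the grouping columns into one factor; drop = TRUE discards
# combinations that never occur (here x.q and y.p).
g <- interaction(tmp2[c("gp1", "gp2")], drop = TRUE)

# Same median selection as before, now over the combined factor.
idx <- tapply(seq_len(nrow(tmp2)), g, function(i, x) {
  x <- x[i]
  i[x == median(x)]
}, x = tmp2$c)

# as.vector() strips the array attributes before indexing rows.
result2 <- tmp2[as.vector(idx), ]
print(result2)
```

This still assumes one median row per group, as in the original one-liner.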
Nick Ellis
CSIRO Marine Research mailto:Nick.Ellis at csiro.au
PO Box 120 ph +61 (07) 3826 7260
Cleveland QLD 4163 fax +61 (07) 3826 7222
Australia http://www.marine.csiro.au
> Date: Wed, 31 Mar 2004 22:22:22 -0500
> From: Ed L Cashin <ecashin at uga.edu>
> Subject: [R] row selection based on median in data frame
> To: r-help at stat.math.ethz.ch
> Message-ID: <873c7otma9.fsf at uga.edu>
> Content-Type: text/plain; charset=us-ascii
>
> Hi. I am having trouble thinking of an easy way to grab rows out of a
> data frame. I want to select the rows with a median value when the
> rows are similar.
>
> A simple example is this table, which I could read into a data frame.
> I would like to find a new data frame with only the rows with a median
> value for the "c" column given a certain "a" value.
>
> For example, the c values for deadlift rows are 13, 8, and 5, so the
> row with a c value of 8 should show up in the output.
>
>          a b  c
> 1 deadlift 7 13
> 2    squat 7 24
> 3    clean 7 10
> 4 deadlift 8  8
> 5    squat 8 20
> 6    clean 8  2
> 7 deadlift 9  5
> 8    squat 9 32
> 9    clean 9 19
>
> Result:
>
>          a b  c
> 4 deadlift 8  8
> 5    squat 8 20
> 3    clean 7 10
>
> It's more complicated in my case, because I have not just one "a"
> column, but about eight columns that have to be the same. I can do
> this with clumsy loops, but I wonder whether there's a better way.
>
> --
> --Ed L Cashin | PGP public key:
> ecashin at uga.edu | http://noserose.net/e/pgp/
>