[R] Averaging within a range of values
Gabor Grothendieck
ggrothendieck at gmail.com
Sat Jan 14 18:23:15 CET 2012
On Fri, Jan 13, 2012 at 6:34 AM, doggysaywhat <chwhite at ucsd.edu> wrote:
> Hello all.
>
> I have two data frames.
> Group Start End
> G1 200 700
> G2 500 1000
> G3 2000 3000
> G4 4000 6000
> G5 7000 8000
>
>
> and
>
> Pos C0 C1
> 200 0.9 0.6
> 500 0.8 0.8
> 800 0.9 0.7
> 1000 0.7 0.6
> 2000 0.6 0.4
> 2500 1.2 0.8
> 3000 0.6 1.5
> 3500 0.7 0.7
> 4000 0.8 0.8
> 4500 0.6 0.6
> 5000 0.9 0.9
> 5500 0.7 0.8
> 6000 0.8 0.7
> 6500 0.4 0.4
> 7000 0.5 0.8
> 7500 0.7 0.9
> 8000 0.9 0.5
> 8500 0.8 0.6
> 9000 0.9 0.8
>
>
> I need to conditionally average all values in columns C0 and C1 based upon
> the bins I defined in the first data frame. For example, for the bin G1 in
> the first dataframe, the values are 200 to 700 so i would average the value
> at pos 200 (0.9) and 500 (0.8) for C0 and then perform the same thing for
> C1.
>
> I can do this in excel with array formulas but I'm relatively new to R and
> would like know if there is a function that will perform the same action. I
> don't know if this will help, but the excel array function I used was
> average(if(range>=start)*(range<=end),range)). Where the range is the
> entire pos column.
>
> Initially I looked at the aggregate function. I can use aggregate when I
> give a single vector to be used for grouping such as (A,B,C) but I'm not
> sure how to define grouping as the bin 200-500 and the second bin as
> 500-1000 etc. and use that as my grouping vector.
>
Here is an sqldf solution where the two input data frames are d1 and
d2 (as in Jeff's post). Note that Group is quoted since its an SQL
keyword:
library(sqldf)
sqldf("select d1.'Group', avg(d2.C0), avg(d2.C1)
from d1, d2
where d2.Pos between d1.Start and d1.End
group by d1.'Group'")
The result is;
Group avg(d2.C0) avg(d2.C1)
1 G1 0.85 0.7000000
2 G2 0.80 0.7000000
3 G3 0.80 0.9000000
4 G4 0.76 0.7600000
5 G5 0.70 0.7333333
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
More information about the R-help
mailing list