[R] Calculating a mean based on a factor range

Jim Lemon jim at bitwrit.com.au
Fri Jun 10 10:31:51 CEST 2011


On 06/10/2011 08:10 AM, Sam Albers wrote:
> Hello all,
>
> I have been using an instrument that collects a temperature profile of a
> water column. The instrument records the temperature and depth any time it
> takes a reading. I was sampling many times at discrete depths rather than a
> complete profile of the water column (e.g. interested in the 5m, 10m and 20m
> depth positions only). The issue was that these measurements were taken with
> the instrument hanging off the side of a boat so a big enough wave moved the
> instrument such that it recorded a slightly different depth. For my
> purposes, however, this difference is negligible and I wish to consider all
> those different readings at nearby depths as a single depth. So for example:
>
>
> library(ggplot2)
>
> eg <- read.csv("http://dl.dropbox.com/u/1574243/example_data.csv",
>                header=TRUE, sep=",")
>
> ## Calculating an average value from all the readings for each depth reading
> eg.avg <- ddply(eg, c("site", "depth"), function(df)
>   return(c(temp=mean(df$temperature),
>            num_samp=length(df$temperature))))
>
> ## An example of my problem
> eg.avg[eg.avg$num_samp > 10 & eg.avg$site == "Station 3", ]
>           site    depth     temp num_samp
> 154 Station 3  1.09000 4.073667       30
> 159 Station 3  2.49744 3.950972       72
> 175 Station 3  7.96332 3.903188       69
> 208 Station 3 19.37708 4.066393       61
> 209 Station 3 19.54096 4.025385       13
>
> ## So here you will notice that records 208 and 209, by my criteria, should
> be considered a sample at the same depth and lumped together. Yet I can't
> figure out a way to coerce R to calculate a mean value of temperature based
> on a threshold depth range (say +/- 0.25). Generally speaking this can be
> said to be calculating a mean (temperature) based on a factor (depth) range.
>
Hi Sam,
I suspect that you have a set of "nominal depths" in which you are 
interested. If so, I would suggest using "cut" as Peter suggested, but 
specifying the breaks like this:

depth_cat<-cut(depth,breaks=c(0,2.5,7.5,15,30),
  labels=c("0","5","10","25"))

This way, the values nearest to each of your "nominal depths" will be
aggregated into the correct category.
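
As a rough sketch of how such categories could then be used to get the
means: the data frame below is made up (only the column names follow the
question), and the nominal depths of 5, 10 and 20 m, with breaks placed
midway between them, are assumptions taken from the depths mentioned in the
question. Base R's aggregate() is used here in place of ddply, which is just
one of several ways to do it.

```r
## Hypothetical readings standing in for the real file; the column names
## (site, depth, temperature) follow the question, the values are invented.
eg <- data.frame(
  site        = rep("Station 3", 6),
  depth       = c(4.9, 5.1, 9.8, 10.2, 19.4, 19.5),
  temperature = c(4.0, 4.2, 3.9, 4.1, 4.05, 4.03)
)

## Assumed nominal depths of 5, 10 and 20 m. Breaks sit midway between
## them; the outer limits (0 and 30) are arbitrary.
eg$depth_cat <- cut(eg$depth, breaks = c(0, 7.5, 15, 30),
                    labels = c("5", "10", "20"))

## Mean temperature per site and depth category
eg.avg <- aggregate(temperature ~ site + depth_cat, data = eg, FUN = mean)

## Number of readings per site and depth category (same grouping, so the
## rows line up with eg.avg)
eg.avg$num_samp <- aggregate(temperature ~ site + depth_cat, data = eg,
                             FUN = length)$temperature

print(eg.avg)
```

With real data the breaks would need adjusting so that each one falls
between two adjacent nominal depths. Note also that cut() uses intervals
closed on the right by default, so a depth of exactly 0 would become NA
unless include.lowest=TRUE is given.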

Jim


