[R] Standardizing the number of records by group

Sam Albers tonightsthenight at gmail.com
Mon Jul 25 21:24:37 CEST 2011


Hello R-help,

I have some data collected at regular intervals, but over varying
lengths of time. I would like to standardize the length of time
covered, and I can do this by standardizing the number of records I
use in my analysis.

Take for example the data set below:


library(plyr)
x <- runif(18,10, 15)
df <- as.data.frame(x)
df$fac <- factor(c("Test1","Test1","Test1","Test1","Test1","Test1","Test1",
                 "Test2","Test2","Test2","Test2","Test2",
                 "Test3","Test3","Test3","Test3","Test3","Test3"))

## Here is where I would like to standardize the number of records

## Mean of x and number of records for each level of fac
df.avg <- ddply(df, "fac", function(d) c(x.avg = mean(d$x),
                                         n = length(d$x)))
df.avg

Here there is a different number of records for each factor level
(7, 5 and 6). Say I only wanted to use the first 4 records at each
factor level. Before taking the mean of these values, how might I
drop all the records after the fourth? Can anyone suggest a good way
to do this?
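
One possible approach, offered as a minimal sketch rather than a
definitive answer: split the data frame by fac with ddply(), keep the
first rows of each piece with head(), and then average as before. The
names n.keep and df.sub are hypothetical, and set.seed() is added only
so the example is reproducible. The sketch assumes the rows within
each level are already in collection order.

```r
library(plyr)

set.seed(1)  # assumption: fixed seed, only for reproducibility
df <- data.frame(x = runif(18, 10, 15),
                 fac = factor(rep(c("Test1", "Test2", "Test3"),
                                  times = c(7, 5, 6))))

n.keep <- 4  # hypothetical name: number of records to keep per level

## head() keeps the first n.keep rows of each piece
## (or all rows, if a level has fewer than n.keep records)
df.sub <- ddply(df, "fac", function(d) head(d, n.keep))

## Same summary as in the original example, on the trimmed data
df.avg <- ddply(df.sub, "fac", function(d) c(x.avg = mean(d$x),
                                             n = length(d$x)))
df.avg
```

Note that a level with fewer than n.keep records would keep all of its
rows, so the n column is worth checking before comparing the means.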

I am using R 2.12.1 and Emacs + ESS.

Thanks so much in advance.

Sam


