[R] Need help on dataframe

David L Carlson dcarlson at tamu.edu
Sat Jan 5 20:22:48 CET 2013


This is a slight modification of John's approach using 6 variables and 28
observations. It takes the mean of each block of 7 rows and repeats that
mean on every row of the block:

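set.seed(42)
# aa is an ID column; X1..X6 are six random variables
xx  <-  data.frame(aa = 1:28, matrix(sample(1:6, 6*28,
    replace = TRUE), nrow= 28))
# Block index: rows 1-7 -> 1, rows 8-14 -> 2, etc.
dd  <- ((1:nrow(xx)-1) %/% 7) +1
# Block means, expanded back to one row per observation by indexing with dd
result <- aggregate(xx[,-1], by=list(dd), FUN=mean)[dd,-1]
result <- data.frame(aa=xx$aa, result)
row.names(result) <- row.names(xx)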
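
As a rough sketch (not part of the original thread), the same block-index idea can be sized to the original question, assuming 288 rows, 71 variables, and blocks of 12 rows, and returning one row of means per block as a matrix. The data are simulated, and the names y, block, grp, and block_means are made up for illustration:

```r
set.seed(1)
# Simulated stand-in for the poster's data: ID plus V1..V71, 288 rows
y <- data.frame(ID = 1:288,
                matrix(sample(1:100, 288 * 71, replace = TRUE), nrow = 288))
names(y)[-1] <- paste0("V", 1:71)

block <- 12                                   # assumed block length
grp <- (seq_len(nrow(y)) - 1) %/% block + 1   # 1,...,1,2,...,2, and so on

# rowsum() adds up the rows within each group; dividing by the group
# sizes turns the sums into means (one row per block, one column per V)
block_means <- rowsum(as.matrix(y[, -1]), grp) / as.vector(table(grp))
dim(block_means)
```

Changing block to 7 would give the 7-row grouping used above; rowsum() is used here only because it returns a matrix directly, and aggregate() would work as well.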

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of John Kane
> Sent: Saturday, January 05, 2013 11:23 AM
> To: Simonas Kecorius; r-help at r-project.org
> Subject: Re: [R] Need help on dataframe
> 
> Well, a rather simple-minded, brute force approach would be to add a
> factor variable to the data frame and use aggregate on it.
> 
> I am sure there are better ways but this will work.
> 
> EXAMPLE
> ###
> xx  <-  data.frame(aa =1:24,
>                      b = matrix(sample(c(1,2,3,4,5,6), 72,
>                                 replace = TRUE), nrow= 24))
>   dd  <-rep(c("a","b"), each= 12)
> 
>   xx  <-  cbind(dd, xx)
> 
>   aggregate(xx[,3:5], list(xx$dd), mean)
> 
> ################
> 
> By the way, a good way to supply data samples is the dput() function.
> See ?dput for details.
> John Kane
> Kingston ON Canada
> 
> 
> > -----Original Message-----
> > From: simolas2008 at gmail.com
> > Sent: Sat, 5 Jan 2013 15:33:03 +0200
> > To: r-help at r-project.org
> > Subject: [R] Need help on dataframe
> >
> > Dear R users, I ran into a problem taking means (or other summary
> > statistics) of a big dataframe.
> >
> > Suppose we have a dataframe:
> >
> > ID  V1  V2  V3  V4 ........................ V71
> >  1    6     5    3     2  ........................  3
> >  2    3     2    2     1  ........................  1
> >  3    6     5    3     2  ........................  3
> >  4    12   15  3     2  ........................  100
> > ........................................................
> > ........................................................
> > 288 10  20  30   30 .......................... 499
> >
> > I need to find a way to calculate the mean of every 12 lines to get:
> >
> > V1                  V2            V3            V4  ......  V71
> > mean from 1 to 7    same as V1    same as V1
> > mean from 8 to 14   same as V1    same as V1
> > etc.
> >
> > I can do it column by column using:
> >
> > y.ts <- ts(y$V1, frequency=12)
> > aggregate(y.ts, FUN=mean)
> >
> > But this is hardcore... Can anyone suggest a better way to compute
> > over the whole dataframe at once and get the result as a matrix?
> >
> > Thank you in advance!
> >
> > --
> > Simonas Kecorius
> >
> > ______________________________________________
> > R-help at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.