[R] Grouping and Computing

Warnes, Gregory R gregory_r_warnes at groton.pfizer.com
Thu Feb 7 17:15:16 CET 2002


Generally, the data frame is the most useful object for storing data.  It
allows each column to have a different type (factor, numeric, ...).

This is the default object type returned from read.table(), read.csv(), etc.
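
As a small illustration of the point about column types, the sketch below
builds a data frame by hand and contrasts it with a matrix, which can hold
only one type.  The values are made up for the illustration; only the column
names Group, Value1 and Value2 follow the sample file used in this message.

```r
# Illustrative values only (not taken from a real data set).
df <- data.frame(
  Group  = factor(c("A", "A", "B")),
  Value1 = c(1, 1, 2),
  Value2 = c(2, 3, 3)
)
col.classes <- sapply(df, class)   # one class per column
print(col.classes)

# A matrix holds a single type: mixing labels and numbers
# turns every entry into a character string.
m <- cbind(Group = c("A", "A", "B"), Value1 = c(1, 1, 2))
print(mode(m))

# An existing matrix can be converted with as.data.frame();
# columns that were coerced to character need converting back.
df2 <- as.data.frame(m, stringsAsFactors = FALSE)
df2$Value1 <- as.numeric(df2$Value1)
```

If your data already sit in a matrix, converting with as.data.frame() as
above is probably the easiest way to get back to a data frame.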

If you have your data in a data file named "data.txt" in the format

Group  Value1  Value2
A       1       2
A       1       3
B       2       3
B       1       3
C       1       1
...


you can read it into R with

data <- read.table("data.txt", header=TRUE)
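
To try this without creating a file, the same call can be pointed at a
character string through the 'text' argument of read.table() in current
versions of R.  This is only a sketch: the string holds just the five sample
rows shown above (the elided "..." rows are left out), and
stringsAsFactors = TRUE is my assumption so that Group comes back as a
factor.

```r
# Inline copy of the five sample rows; stands in for "data.txt".
txt <- "Group Value1 Value2
A 1 2
A 1 3
B 2 3
B 1 3
C 1 1"

# header = TRUE takes the column names from the first line.
# stringsAsFactors = TRUE (an assumption here) makes Group a factor.
data <- read.table(text = txt, header = TRUE, stringsAsFactors = TRUE)

str(data)   # shows one factor column and two numeric columns
```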

Now, to get the summary you want, you should 

  1) create a function to compute the summary for a data.frame containing
only the data for one group.  Something like

	compute.summary <- function(x)
            sum( x$Value1 / x$Value2 )

  2) Use 'split' to break the data frame into one chunk per group, and
'sapply' to call your function on each chunk:

	tmp <- split( data, data$Group )

	results <- sapply( tmp, compute.summary )
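
Put together, the two steps might look like the sketch below.  The data
frame is typed in by hand from the five sample rows above, so it runs
without a file.  The tapply() line at the end is an alternative I am adding
for comparison: it computes the same sums without building the intermediate
list.

```r
# The five sample rows, entered directly instead of read from "data.txt".
data <- data.frame(
  Group  = c("A", "A", "B", "B", "C"),
  Value1 = c(1, 1, 2, 1, 1),
  Value2 = c(2, 3, 3, 3, 1)
)

# Step 1: summary for the rows of a single group.
compute.summary <- function(x)
  sum(x$Value1 / x$Value2)

# Step 2: one data frame per group, then one number per data frame.
tmp <- split(data, data$Group)
results <- sapply(tmp, compute.summary)
print(results)
# A is 1/2 + 1/3, B is 2/3 + 1/3, C is 1/1

# Alternative: take the row-wise ratio first, then sum it within groups.
results2 <- tapply(data$Value1 / data$Value2, data$Group, sum)
```

The groups do not have to be known in advance: split() works out the
distinct values of data$Group itself, and 'results' is a numeric vector
named by group.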

You will probably want to look at the help pages for read.table, split, and
sapply.  You should also (if you haven't already) pick up the manual 'An
Introduction to R' from http://cran.r-project.org/manuals.html

-Greg

> -----Original Message-----
> From: alexander.hener at gmx.de [mailto:alexander.hener at gmx.de]
> Sent: Thursday, February 07, 2002 9:03 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Grouping and Computing
> 
> 
> Hi group,
> 
> To mention it in advance, I am an R newbie, and most likely my question is
> more a mix of smaller, simpler tasks. Anyway, I got mixed up between by,
> select, aggregate, lapply etc.
> My problem is as follows:
> 
> I have read data in and transformed them into a matrix for no special
> reason so far. This matrix contains a column with regard to which I would
> like to group, i.e. one realisation specifies one group. Neither the number
> of occurrences nor the value of these realisations is known in advance,
> which seems to be the major problem. For each group separately then, I
> would like to compute some aggregation function, namely the sum of a
> fraction of two columns. These sums should be kept in form of another
> vector.
> 
> My two questions are then
> 
> - Which object type (matrix, dataframe, list) lends itself to such a
> problem?
> - Do I have to create different objects for the groups, or can I compute
> the vector of sums directly? And how?
>  
> Thanks in advance
> 
> Alexander Hener
> 
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._




