[R] Summary Statistics for data.frame

Duncan Murdoch murdoch at stats.uwo.ca
Sat Jul 8 22:00:44 CEST 2006


On 7/8/2006 3:44 PM, justin rapp wrote:
> I apologize for my constant questions but I am new to R and trying to
> gain an appreciation for its capabilities.  The following task is easy
> in Excel and I was hoping somebody could give me a quick explanation
> for how it can be achieved in R so I can avoid having to switch
> between the two applications.
> 
> How do I find the summary statistics for one vector of the data.frame by
> the levels of another of its vectors?
> 
> For example, I have the following headings for my data.frame.
> Conference
> Year Drafted
> Height
> Weight
> Ratio
> 
> I would like to compute the mean Height, Weight, and Ratio, as well
> as their variances, for each of the years under Year Drafted
> (1980-2000).  What is the most efficient way of doing this?

I think the quickest is

by(mydf, mydf$Year, summary)

but this won't give you the variance.  You'll need your own little 
function to calculate mean and variance, e.g.

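# Use only the numeric columns: apply() would coerce the whole data.frame
# (including Conference) to character, and mean() would then return NA.
mysummary <- function(df) sapply(df[c("Height", "Weight", "Ratio")],
                function(x) c(mean=mean(x), variance=var(x)))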

by(mydf, mydf$Year, mysummary)

If you don't like the format of the output, you can play around with the 
mysummary function.  It will be applied to each subset of the 
data.frame, and the results will be put together into a list-like object 
of class "by" with one entry per level of mydf$Year.
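
As a rough sketch of how this plays out, here is the same idea on a toy 
data.frame.  The data values and the column name Year.Drafted are invented 
for illustration (they are assumptions, not the poster's data), and the 
aggregate()/merge() step at the end is just one possible way to get a 
flatter layout, not the only one.

```r
# Toy data: values and the name Year.Drafted are made up for illustration.
mydf <- data.frame(
  Conference   = c("ACC", "ACC", "SEC", "SEC", "ACC", "SEC"),
  Year.Drafted = c(1980, 1980, 1980, 1981, 1981, 1981),
  Height       = c(78, 80, 82, 75, 77, 79),
  Weight       = c(200, 220, 240, 190, 210, 230),
  Ratio        = c(2.5, 2.75, 3.0, 2.5, 2.75, 3.0)
)

# Mean and variance of the numeric columns of one subset.
mysummary <- function(df) sapply(df[c("Height", "Weight", "Ratio")],
                function(x) c(mean = mean(x), variance = var(x)))

# One entry per year; each entry is a matrix with rows "mean" and "variance".
res <- by(mydf, mydf$Year.Drafted, mysummary)

# Pull out a single year by its level, given as a character string.
res[["1980"]]

# Alternative layout: one row per year, via aggregate() and merge().
means <- aggregate(cbind(Height, Weight, Ratio) ~ Year.Drafted,
                   data = mydf, FUN = mean)
vars  <- aggregate(cbind(Height, Weight, Ratio) ~ Year.Drafted,
                   data = mydf, FUN = var)
flat  <- merge(means, vars, by = "Year.Drafted",
               suffixes = c(".mean", ".var"))
flat
```

The by() result is convenient for looking at one year at a time, while the 
merged data.frame is easier to sort, plot, or export.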


Duncan


