[R] Summary statistics across factor levels

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Apr 30 12:56:19 CEST 2008


Lauri Nikkinen wrote:
> R users,
> 
> My intention is to calculate some summary statistics across factor
> levels. I know that the Hmisc package has a summary function that
> produces neat summary statistics with the method="cross" option. I would
> like to produce similar output, with N and Missing columns, but as
> a data.frame. Is there a built-in function for that?

Take a look at the Hmisc summarize function.
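
A minimal sketch of how summarize might be applied to the example data
quoted below; it is not from the original thread. The helper cell_stats
and the object names d and res are hypothetical. stat.name="Mean" is
passed because summarize otherwise names the first statistic after its
X argument.

```r
library(Hmisc)

# Recreate the example data from the quoted post as a plain data.frame
d <- as.data.frame(Indometh)
d$time[seq(1, nrow(d), 9)] <- NA
d$timeclass <- cut(d$time, breaks = c(0, 2, 4, 6, 8, 10))

# Hypothetical helper: mean, non-missing count, and missing count of x
cell_stats <- function(x) {
  c(Mean = mean(x, na.rm = TRUE),
    N = sum(!is.na(x)),
    Missing = sum(is.na(x)))
}

# One row per Subject x timeclass combination, returned as a data.frame
res <- with(d, summarize(conc, llist(Subject, timeclass), cell_stats,
                         stat.name = "Mean"))
res
```

Note that in this example the NAs are in time, not in conc, so Missing
here counts missing conc values within each cell.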

> 
> #example data
> install.packages("Hmisc")
> library(Hmisc)
> sek <- seq(1, nrow(Indometh), 9)
> Indometh$time[sek] <- NA
> Indometh$timeclass <- factor(cut(Indometh$time, breaks=c(0,2,4,6,8,10)))
> Indometh
> with(Indometh, summary(conc ~ Subject + timeclass, method="cross"))

If using summary, I suggest

  summary(conc ~ Subject + cut2(time, c(0,2,4,6,8,10)),
          data=Indometh, method='cross')

Frank

> 
> #similar with aggregate and reshape but no N or Missing count
> i.mean <- aggregate(Indometh$conc, list(Indometh$Subject,
> Indometh$timeclass), mean, na.rm=TRUE)
> i.mean.rhsp <- reshape(i.mean, v.names="x", idvar="Group.1",
> timevar="Group.2", direction="wide")
> i.mean.rhsp  # N and missing columns needed
> 
> Thanks,
> Lauri
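
For comparison, the aggregate approach above can be extended in base R
so that the counts come out alongside the mean. This is only a sketch
under assumptions: cell_stats, d, agg and out are hypothetical names,
and it uses aggregate in place of the Hmisc functions.

```r
# Recreate the example data from the quoted post as a plain data.frame
d <- as.data.frame(Indometh)
d$time[seq(1, nrow(d), 9)] <- NA
d$timeclass <- cut(d$time, breaks = c(0, 2, 4, 6, 8, 10))

# Hypothetical helper: mean, non-missing count, and missing count of x
cell_stats <- function(x) {
  c(Mean = mean(x, na.rm = TRUE),
    N = sum(!is.na(x)),
    Missing = sum(is.na(x)))
}

# na.pass keeps NA values so that cell_stats can count them;
# aggregate still omits rows whose grouping variables are NA
agg <- aggregate(conc ~ Subject + timeclass, data = d,
                 FUN = cell_stats, na.action = na.pass)

# The three statistics come back as one matrix column; flatten it
out <- cbind(agg[c("Subject", "timeclass")], as.data.frame(agg$conc))
out
```

Because the NAs in this example are in time rather than conc, the rows
with a missing timeclass fall into no cell and Missing is 0 throughout.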


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University


