[R] Nicely formatted summary table with mean, standard deviation or number and proportion
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Mon May 14 04:11:51 CEST 2007
Keith Wong wrote:
> Dear all,
>
> The incredibly useful Hmisc package provides a method to generate
> summary tables that can be typeset in latex. The Alzola and Harrell book
> "An introduction to S and the Hmisc and Design libraries" provides an
> example that generates mean and quartiles for continuous variables, and
> numbers and percentages for count variables: summary() with method =
> 'reverse'.
>
> I wonder if there is a way to change it so the mean and standard
> deviation are reported instead for continuous variables.
>
> I illustrate my question below using an example from the book.
>
> Thank you.
>
> Keith
Newer versions of Hmisc have an option to add mean and SD for
method='reverse'. Quartiles are always there.
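
A minimal sketch of this, assuming the option meant here is the prmsd
argument of the print and latex methods for method='reverse' summaries
(check ?summary.formula in your installed Hmisc to confirm). The data
frame d is simply the simulated example from the original post, rebuilt
here so the snippet stands alone.

```r
library(Hmisc)

# Simulated data, as in the original post
set.seed(173)
d <- data.frame(
  sex       = factor(sample(c("m", "f"), 500, replace = TRUE)),
  age       = rnorm(500, 50, 5),
  treatment = factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))
)

x <- summary(treatment ~ age + sex, data = d, method = "reverse")

# Assumption: prmsd = TRUE asks the print method to show mean and SD
# for continuous variables, after the three quartiles
print(x, prmsd = TRUE)

# The latex method is assumed to take the same argument:
# latex(x, prmsd = TRUE)
```

The quartiles remain in the table; mean and SD are added to them rather
than replacing them.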
Frank
>
>
> > ####
> > library(Hmisc)
> >
> > set.seed(173)
> > sex = factor(sample(c("m", "f"), 500, replace = TRUE))
> > age = rnorm(500, 50, 5)
> > treatment = factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))
> > summary(sex ~ treatment, fun = table)
> sex N=500
>
> +---------+-------+---+---+---+
> |         |       |N  |f  |m  |
> +---------+-------+---+---+---+
> |treatment|Drug   |263|140|123|
> |         |Placebo|237|133|104|
> +---------+-------+---+---+---+
> |Overall  |       |500|273|227|
> +---------+-------+---+---+---+
> >
> >
> >
> > (x = summary(treatment ~ age + sex, method = "reverse"))
> > # generates quartiles for continuous variables
>
>
> Descriptive Statistics by treatment
>
> +-------+--------------+--------------+
> |       |Drug          |Placebo       |
> |       |(N=263)       |(N=237)       |
> +-------+--------------+--------------+
> |age    |46.5/49.9/53.2|46.7/50.0/53.4|
> +-------+--------------+--------------+
> |sex : m|  47% (123)   |  44% (104)   |
> +-------+--------------+--------------+
> >
> >
> > # latex(x) generates a very nicely formatted table
> > # but I'd like "mean (standard deviation)" instead of quartiles.
>
>
>
> > # this function is from
> > # http://tolstoy.newcastle.edu.au/R/e2/help/06/11/4713.html
> > g <- function(y) {
> +   # y is a matrix; summarize each column as N, Mean, SD
> +   s <- apply(y, 2,
> +     function(z) {
> +       z <- z[!is.na(z)]
> +       n <- length(z)
> +       # every branch returns the same three named elements
> +       if(n==0) c(N=0, Mean=NA, SD=NA) else
> +       if(n==1) c(N=1, Mean=z, SD=NA) else {
> +         m <- mean(z)
> +         s <- sd(z)
> +         c(N=n, Mean=m, SD=s)
> +       }
> +     })
> +   w <- as.vector(s)
> +   names(w) <- as.vector(outer(rownames(s), colnames(s), paste, sep=''))
> +   w
> + }
>
> >
> > summary(treatment ~ age + sex, method = "reverse", fun = g)
> > # does not work: the 'fun' (or 'FUN') argument is ignored.
>
>
> Descriptive Statistics by treatment
>
> +-------+--------------+--------------+
> |       |Drug          |Placebo       |
> |       |(N=263)       |(N=237)       |
> +-------+--------------+--------------+
> |age    |46.5/49.9/53.2|46.7/50.0/53.4|
> +-------+--------------+--------------+
> |sex : m|  47% (123)   |  44% (104)   |
> +-------+--------------+--------------+
> >
> >
> > (x1 = summarize(cbind(age), llist(treatment), FUN = g,
> stat.name=c("n", "mean", "sd")))
>   treatment   n mean   sd
> 1      Drug 263 49.9 4.94
> 2   Placebo 237 50.1 4.97
> >
> > # this works, but the table is rotated, and count data have to be
> > # treated separately.
>
>
>
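
For anyone who wants mean (SD) only, in the same orientation as the
method = "reverse" table, a rough base-R sketch follows. It is not part
of Hmisc and not the package's method: describe_by is a hypothetical
helper written for this illustration, and it only handles numeric
variables and factors, with no latex output.

```r
# Hypothetical helper (not an Hmisc function): one column per group,
# one row per variable; "mean (SD)" for numeric variables and
# "pct% (n)" for each level of a categorical variable.
describe_by <- function(group, vars, digits = 1) {
  group <- factor(group)
  rows <- list()
  for (nm in names(vars)) {
    v <- vars[[nm]]
    if (is.numeric(v)) {
      m <- as.vector(tapply(v, group, mean, na.rm = TRUE))
      s <- as.vector(tapply(v, group, sd, na.rm = TRUE))
      rows[[nm]] <- paste0(formatC(m, format = "f", digits = digits), " (",
                           formatC(s, format = "f", digits = digits), ")")
    } else {
      tab <- table(factor(v), group)
      pct <- 100 * prop.table(tab, margin = 2)  # column percentages
      for (lev in rownames(tab)) {
        rows[[paste(nm, ":", lev)]] <-
          sprintf("%.0f%% (%d)", pct[lev, ], tab[lev, ])
      }
    }
  }
  out <- do.call(rbind, rows)
  colnames(out) <- sprintf("%s (N=%d)", levels(group), as.vector(table(group)))
  noquote(out)
}

# Usage with the simulated data from the post
set.seed(173)
sex <- factor(sample(c("m", "f"), 500, replace = TRUE))
age <- rnorm(500, 50, 5)
treatment <- factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))

tab <- describe_by(treatment, list(age = age, sex = sex))
print(tab)
```

Unlike the method = "reverse" output, this shows every level of a
factor, and continuous and count variables go through one call, so
nothing has to be treated separately.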
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University