[R] Nicely formatted tables
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Thu Dec 14 22:58:41 CET 2006
steve wrote:
> If I use latex(summary(X)) where X is a data frame with four
> variables I get something like
>
>     Rainfall       Education        Popden        Nonwhite
>  Min.   :10.00   Min.   : 9.00   Min.   :1441   Min.   : 0.80
>  1st Qu.:32.75   1st Qu.:10.40   1st Qu.:3104   1st Qu.: 4.95
>  Median :38.00   Median :11.05   Median :3567   Median :10.40
>  Mean   :37.37   Mean   :10.97   Mean   :3866   Mean   :11.87
>  3rd Qu.:43.25   3rd Qu.:11.50   3rd Qu.:4520   3rd Qu.:15.65
>  Max.   :60.00   Max.   :12.30   Max.   :9699   Max.   :38.50
>
>
> where the row headings are repeated four times.
> Is there an easy way to get a nicely formatted table,
> something like
>
>          Rainfall  Education  Popden  Nonwhite
> Min.        10.00       9.00    1441      0.80
> 1st Qu.     32.75      10.40    3104      4.95
> Median      38.00      11.05    3567     10.40
> Mean        37.37      10.97    3866     11.87
> 3rd Qu.     43.25      11.50    4520     15.65
> Max.        60.00      12.30    9699     38.50
>
>
>
Hmm, no. Not without further ado. The function summary.data.frame
produces a table with character entries like "Min. : 1.00 ".
To do better, you first have to note that it can only possibly work for
purely numeric data frames. If you have one of those, then you might
base something off sapply(X, summary), except that it won't work if only
some columns have NA's: the summaries then differ in length, and sapply
returns a list instead of a matrix. Here's an idea:
> my.summary <- function(x){s <- summary(x); if (length(s)==6)
+     c(s,"NA's"=0) else s}
> sapply(airquality,my.summary)
         Ozone Solar.R   Wind  Temp Month  Day
Min.      1.00     7.0  1.700 56.00 5.000  1.0
1st Qu.  18.00   115.8  7.400 72.00 6.000  8.0
Median   31.50   205.0  9.700 79.00 7.000 16.0
Mean     42.13   185.9  9.958 77.88 6.993 15.8
3rd Qu.  63.25   258.8 11.500 85.00 8.000 23.0
Max.    168.00   334.0 20.700 97.00 9.000 31.0
NA's     37.00     7.0  0.000  0.00 0.000  0.0
However, there's an issue with the NA counts: each column is printed in
one common numeric format, so the counts get displayed with decimal
places (up to three, as in the Wind column)...
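
One possible way around the decimal-place issue, offered only as a rough
sketch and not as part of the original suggestion, is to format each
column as character strings yourself, so that the NA count is formatted
separately from the other statistics. The helper name summary_table and
its digits argument are made up for illustration, and the statistics are
computed with quantile() and mean() rather than by calling summary().

```r
# Hypothetical helper (not from the original post): build the summary as a
# character matrix so that the NA counts print as whole numbers.
summary_table <- function(df, digits = 4) {
  # Only meaningful for purely numeric data frames
  stopifnot(all(sapply(df, is.numeric)))
  one_col <- function(x) {
    stats <- c(quantile(x, c(0, 0.25, 0.5), na.rm = TRUE),
               mean(x, na.rm = TRUE),
               quantile(x, c(0.75, 1), na.rm = TRUE))
    # Format the six statistics together, the NA count on its own
    c(format(signif(unname(stats), digits)), format(sum(is.na(x))))
  }
  out <- sapply(df, one_col)
  rownames(out) <- c("Min.", "1st Qu.", "Median", "Mean",
                     "3rd Qu.", "Max.", "NA's")
  out
}

tab <- summary_table(airquality)
print(tab, quote = FALSE, right = TRUE)
```

Since the result is an ordinary character matrix with row and column
names, it could presumably be handed on to latex() in place of
summary(X), though that step is untested here.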