[R] Nicely formatted tables
Marc Schwartz
marc_schwartz at comcast.net
Thu Dec 14 23:01:34 CET 2006
On Thu, 2006-12-14 at 16:37 -0500, steve wrote:
> If I use latex(summary(X)) where X is a data frame with four
> variables I get something like
>
>     Rainfall       Education         Popden        Nonwhite
>  Min.   :10.00   Min.   : 9.00   Min.   :1441   Min.   : 0.80
>  1st Qu.:32.75   1st Qu.:10.40   1st Qu.:3104   1st Qu.: 4.95
>  Median :38.00   Median :11.05   Median :3567   Median :10.40
>  Mean   :37.37   Mean   :10.97   Mean   :3866   Mean   :11.87
>  3rd Qu.:43.25   3rd Qu.:11.50   3rd Qu.:4520   3rd Qu.:15.65
>  Max.   :60.00   Max.   :12.30   Max.   :9699   Max.   :38.50
>
>
> where the row headings are repeated four times.
> Is there an easy way to get a nicely formatted table,
> something like
>
>          Rainfall  Education  Popden  Nonwhite
> Min.        10.00       9.00    1441      0.80
> 1st Qu.     32.75      10.40    3104      4.95
> Median      38.00      11.05    3567     10.40
> Mean        37.37      10.97    3866     11.87
> 3rd Qu.     43.25      11.50    4520     15.65
> Max.        60.00      12.30    9699     38.50
>
>
> Steve
The problem is that summary(), as above, returns a character-based
table/matrix. For example, using the 'iris' data set:
> summary(iris[, 1:4])
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300
 Median :5.800   Median :3.000   Median :4.350   Median :1.300
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500
> str(summary(iris[, 1:4]))
'table' chr [1:6, 1:4] "Min. :4.300 " "1st Qu.:5.100 " ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:6] "" "" "" "" ...
..$ : chr [1:4] " Sepal.Length" " Sepal.Width" " Petal.Length" " Petal.Width"
Hence, the numbers are not separate from the labels, but part of the
table elements.
I might be tempted to construct a better underlying function that just
returns the summary statistics as unformatted numbers in a matrix. It
seems to me that there are such functions, for example in the Hmisc and
the doBy packages, on CRAN. Since you are using latex(), you already
have Hmisc.
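As a rough sketch of what such a function could look like in base R
(the name numeric_summary is hypothetical and is not taken from Hmisc or
doBy; it assumes every column of the data frame is numeric):

```r
# Hypothetical helper (not part of Hmisc or doBy): returns the six
# summary statistics as an unformatted numeric matrix, one column per
# variable. Assumes every column of 'df' is numeric.
numeric_summary <- function(df) {
  sapply(df, function(x) {
    # Quartiles and extremes; names = FALSE so we can label them ourselves
    q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                  na.rm = TRUE, names = FALSE)
    c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
      "Mean" = mean(x, na.rm = TRUE), "3rd Qu." = q[4], "Max." = q[5])
  })
}

num_mat <- numeric_summary(iris[, 1:4])
```

Because the elements are plain numbers, the row labels appear only once
and the rounding can be chosen afterwards, for example with
round(num_mat, 2), before the matrix is handed to latex(). Unlike
summary(), this sketch does no rounding of its own.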
That being said, you could brute force something like this:
# See ?strsplit and ?sapply
mat <- matrix(sapply(strsplit(summary(iris[, 1:4]), ":"), "[[", 2),
              ncol = 4)
> mat
     [,1]     [,2]     [,3]     [,4]
[1,] "4.300 " "2.000 " "1.000 " "0.100 "
[2,] "5.100 " "2.800 " "1.600 " "0.300 "
[3,] "5.800 " "3.000 " "4.350 " "1.300 "
[4,] "5.843 " "3.057 " "3.758 " "1.199 "
[5,] "6.400 " "3.300 " "5.100 " "1.800 "
[6,] "7.900 " "4.400 " "6.900 " "2.500 "
Then add the row and column titles:
rownames(mat) <- c("Min", "1st Qu", "Median", "Mean", "3rd Qu", "Max")
colnames(mat) <- colnames(iris[1:4])
> mat
       Sepal.Length Sepal.Width Petal.Length Petal.Width
Min    "4.300 "     "2.000 "    "1.000 "     "0.100 "
1st Qu "5.100 "     "2.800 "    "1.600 "     "0.300 "
Median "5.800 "     "3.000 "    "4.350 "     "1.300 "
Mean   "5.843 "     "3.057 "    "3.758 "     "1.199 "
3rd Qu "6.400 "     "3.300 "    "5.100 "     "1.800 "
Max    "7.900 "     "4.400 "    "6.900 "     "2.500 "
> latex(mat, file = "")
% latex.default(mat, file = "")
%
\begin{table}[!tbp]
\begin{center}
\begin{tabular}{lllll}\hline\hline
\multicolumn{1}{l}{mat}&
\multicolumn{1}{c}{Sepal.Length}&
\multicolumn{1}{c}{Sepal.Width}&
\multicolumn{1}{c}{Petal.Length}&
\multicolumn{1}{c}{Petal.Width}
\\ \hline
Min&4.300 &2.000 &1.000 &0.100 \\
1st Qu&5.100 &2.800 &1.600 &0.300 \\
Median&5.800 &3.000 &4.350 &1.300 \\
Mean&5.843 &3.057 &3.758 &1.199 \\
3rd Qu&6.400 &3.300 &5.100 &1.800 \\
Max&7.900 &4.400 &6.900 &2.500 \\
\hline
\end{tabular}
\end{center}
\end{table}
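One cosmetic wrinkle in the brute force version is that each element
keeps the trailing blanks from summary(), which is why the cells print
as "4.300 " and so on. A minimal sketch of stripping them before calling
latex(), assuming R 3.2.0 or later so that trimws() is available:

```r
# Rebuild the character matrix as above, then strip the padding.
mat <- matrix(sapply(strsplit(summary(iris[, 1:4]), ":"), "[[", 2),
              ncol = 4)
mat[] <- trimws(mat)   # assigning into mat[] keeps the matrix dimensions
rownames(mat) <- c("Min", "1st Qu", "Median", "Mean", "3rd Qu", "Max")
colnames(mat) <- colnames(iris[1:4])
```

The values stay character strings, so the trailing zeros produced by
summary() are preserved, and the result can be passed to
latex(mat, file = "") as before.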
HTH,
Marc Schwartz