[R] Nicely formatted tables

Kuhn, Max Max.Kuhn at pfizer.com
Thu Dec 14 23:09:18 CET 2006


How about:

> apply(iris[, 1:4], 2, summary)
        Sepal.Length Sepal.Width Petal.Length Petal.Width
Min.           4.300       2.000        1.000       0.100
1st Qu.        5.100       2.800        1.600       0.300
Median         5.800       3.000        4.350       1.300
Mean           5.843       3.057        3.758       1.199
3rd Qu.        6.400       3.300        5.100       1.800
Max.           7.900       4.400        6.900       2.500
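
Because apply() returns an ordinary numeric matrix here, the result can be rounded and handed to latex() directly. The lines below are only a sketch: they assume Hmisc is installed, and the name `stats` and the use of rowlabel = "" to blank the top-left header cell are illustrative choices, not part of the original suggestion.

```r
# Numeric summary matrix: statistics in rows, variables in columns
stats <- apply(iris[, 1:4], 2, summary)

# Unlike summary() on the data frame, this is numeric, so it can be rounded
stats <- round(stats, 2)

# Print LaTeX to the console if Hmisc is available (assumption: installed)
if (requireNamespace("Hmisc", quietly = TRUE)) {
  Hmisc::latex(stats, file = "", rowlabel = "")
}
```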

Max 

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Marc Schwartz
Sent: Thursday, December 14, 2006 5:02 PM
To: steve
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] Nicely formatted tables

On Thu, 2006-12-14 at 16:37 -0500, steve wrote:
> If I use latex(summary(X)) where X is a data frame with four
> variables I get something like
> 
>     Rainfall       Education         Popden        Nonwhite    
>  Min.   :10.00   Min.   : 9.00   Min.   :1441   Min.   : 0.80  
>  1st Qu.:32.75   1st Qu.:10.40   1st Qu.:3104   1st Qu.: 4.95  
>  Median :38.00   Median :11.05   Median :3567   Median :10.40  
>  Mean   :37.37   Mean   :10.97   Mean   :3866   Mean   :11.87  
>  3rd Qu.:43.25   3rd Qu.:11.50   3rd Qu.:4520   3rd Qu.:15.65  
>  Max.   :60.00   Max.   :12.30   Max.   :9699   Max.   :38.50  
> 
> 
> where the row headings are repeated four times.
> Is there an easy way to get a nicely formatted table,
> something like
> 
>         Rainfall     Education   Popden    Nonwhite    
>  Min.     10.00       9.00        1441        0.80  
>  1st Qu.  32.75      10.40        3104        4.95  
>  Median   38.00      11.05        3567       10.40  
>  Mean     37.37      10.97        3866       11.87  
>  3rd Qu.  43.25      11.50        4520       15.65  
>  Max.     60.00      12.30        9699       38.50  
> 
> 
> Steve

The problem is that summary(), as above, returns a character-based
table/matrix.  For example, using the 'iris' data set:

> summary(iris[, 1:4])
  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width   
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  


> str(summary(iris[, 1:4]))
 'table' chr [1:6, 1:4] "Min.   :4.300  " "1st Qu.:5.100  " ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:6] "" "" "" "" ...
  ..$ : chr [1:4] " Sepal.Length" " Sepal.Width" " Petal.Length" " Petal.Width"


Hence, the numbers are not separate from the labels, but part of the
table elements. 

I might be tempted to construct a better underlying function that just
returned the summary statistics as unformatted numbers in a matrix. I
believe there are such functions, for example in the Hmisc and doBy
packages on CRAN. Since you are using latex(), you already have Hmisc.
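
One way such an underlying function might look is sketched below, using base R only. The name `num_summary` and its `na.rm` argument are hypothetical, made up for illustration; this is not a function from Hmisc or doBy.

```r
# Hypothetical helper: six summary statistics per numeric column,
# returned as a plain numeric matrix (rows = statistics, cols = variables)
num_summary <- function(df, na.rm = TRUE) {
  # Keep only the numeric columns
  num <- df[vapply(df, is.numeric, logical(1))]
  vapply(num, function(x) {
    q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1),
                  na.rm = na.rm, names = FALSE)
    c("Min." = q[1], "1st Qu." = q[2], "Median" = q[3],
      "Mean" = mean(x, na.rm = na.rm),
      "3rd Qu." = q[4], "Max." = q[5])
  }, numeric(6))
}

s <- num_summary(iris)   # the factor column Species is dropped
```

Since `s` is numeric, it can be rounded or formatted as needed before being passed to latex().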

That being said, you could brute force something like this:

# See ?strsplit and ?sapply

mat <- matrix(sapply(strsplit(summary(iris[, 1:4]), ":"), "[[", 2), 
              ncol = 4)

> mat
     [,1]      [,2]      [,3]      [,4]     
[1,] "4.300  " "2.000  " "1.000  " "0.100  "
[2,] "5.100  " "2.800  " "1.600  " "0.300  "
[3,] "5.800  " "3.000  " "4.350  " "1.300  "
[4,] "5.843  " "3.057  " "3.758  " "1.199  "
[5,] "6.400  " "3.300  " "5.100  " "1.800  "
[6,] "7.900  " "4.400  " "6.900  " "2.500  "


Then add the row and column titles:

rownames(mat) <- c("Min", "1st Qu", "Median", "Mean", "3rd Qu", "Max")

colnames(mat) <- colnames(iris[1:4])


> mat
       Sepal.Length Sepal.Width Petal.Length Petal.Width
Min    "4.300  "    "2.000  "   "1.000  "    "0.100  "  
1st Qu "5.100  "    "2.800  "   "1.600  "    "0.300  "  
Median "5.800  "    "3.000  "   "4.350  "    "1.300  "  
Mean   "5.843  "    "3.057  "   "3.758  "    "1.199  "  
3rd Qu "6.400  "    "3.300  "   "5.100  "    "1.800  "  
Max    "7.900  "    "4.400  "   "6.900  "    "2.500  "  
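
The quoted strings above still carry trailing blanks. If numbers rather than strings are wanted, the same brute-force split can be followed by a trim and a numeric conversion. This is a sketch only: trimws() needs R 3.2.0 or later, the names `tab`, `vals` and `num` are illustrative, and the approach assumes every column is numeric with no missing values (otherwise summary() adds cells this split does not handle).

```r
tab <- summary(iris[, 1:4])

# Take the text after the ":" in each cell, as in the brute-force code above
vals <- sapply(strsplit(as.vector(tab), ":"), "[[", 2)

# Strip the padding, convert to numeric, and rebuild the 6 x 4 layout
num <- matrix(as.numeric(trimws(vals)), ncol = ncol(tab),
              dimnames = list(c("Min", "1st Qu", "Median",
                                "Mean", "3rd Qu", "Max"),
                              trimws(colnames(tab))))
```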


> latex(mat, file = "")
% latex.default(mat, file = "") 
%
\begin{table}[!tbp]
 \begin{center}
 \begin{tabular}{lllll}\hline\hline
\multicolumn{1}{l}{mat}&
\multicolumn{1}{c}{Sepal.Length}&
\multicolumn{1}{c}{Sepal.Width}&
\multicolumn{1}{c}{Petal.Length}&
\multicolumn{1}{c}{Petal.Width}
\\ \hline
Min&4.300  &2.000  &1.000  &0.100  \\
1st Qu&5.100  &2.800  &1.600  &0.300  \\
Median&5.800  &3.000  &4.350  &1.300  \\
Mean&5.843  &3.057  &3.758  &1.199  \\
3rd Qu&6.400  &3.300  &5.100  &1.800  \\
Max&7.900  &4.400  &6.900  &2.500  \\
\hline
\end{tabular}

\end{center}

\end{table}
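
For comparison, a bare tabular body can also be assembled in base R without Hmisc. This is a swapped-in alternative to latex(), not what latex() does internally; the function name `to_tabular` and its `digits` argument are hypothetical, and column or row names containing LaTeX special characters such as "_" or "%" would need escaping first.

```r
# Hypothetical helper: numeric matrix -> character vector of LaTeX lines
to_tabular <- function(m, digits = 2) {
  # Fixed-point formatting; formatC keeps the matrix dimensions
  body <- formatC(m, format = "f", digits = digits)
  rows <- paste0(rownames(m), " & ",
                 apply(body, 1, paste, collapse = " & "), " \\\\")
  c(paste0("\\begin{tabular}{l",
           paste(rep("r", ncol(m)), collapse = ""), "}"),
    paste0(" & ", paste(colnames(m), collapse = " & "), " \\\\ \\hline"),
    rows,
    "\\end{tabular}")
}

out <- to_tabular(apply(iris[, 1:4], 2, summary))
writeLines(out)
```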



HTH,

Marc Schwartz
