[R] Tables with Graphical Representations

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Fri Sep 1 16:56:54 CEST 2006


On 31-Aug-06 Sam Ferguson wrote:
> Hi useRs -
> 
> I was wondering if anyone out there can tell me where to find
> R-code to do mixes of tables and graphics. I am thinking of
> something similar to this:
> http://yost.com/information-design/powerpoint-corrupts/
> or like the excel routines people are demonstrating:
> http://infosthetics.com/archives/2006/08/excel_in_cell_graphing.html
> 
> My aim is to provide small graphics to illustrate numbers directly  
> beside or behind their position in the table. Maybe there is a way
> to do it with lattice?
> 
> Thanks for any help you may be able to provide.
> Sam Ferguson

I dare say there may be a way to do that kind of thing directly within R,
and if so then the graphics experts will no doubt tell us how!

But your examples are just one kind of combined tabular/graphic layout
(and somewhat similar to each other). In a more general context of
combining tables of numerical results with graphic displays, it is
perhaps better to think in terms of using R to produce the numerical
results in the first instance, and then handing these over to software
designed for general-purpose graphical/textual layout. You then have
complete control, and full flexibility of design.

Indeed, in your second (Excel) example, the method of production is
just a nasty kludge -- and it was a happy coincidence that the "REPT"
function was available in Excel at all!

As Frank Harrell has just posted (just as I was completing this one!),
you can do this sort of thing in LaTeX (his example shows little
histograms of the data, above each different tabular section). LaTeX
is an example of software which allows you to create precisely formatted
graphics within precisely formatted text.

However, I'm no expert on LaTeX, preferring what I've been used to for
too many years, namely Unix 'troff' and its more recent GNU implementation
'groff'.

As a preliminary, you will need to get R to output a suitable data
file, or a suitably composed data file with 'groff' formatting tags
interspersed. The latter should not be difficult, though my own approach
would be to simply take a data file of the form (for your first example
as taken from your URL):

"% survival / standard error" "5 year" "10 year" "15 year" "20 year"
"Prostate" 98.8 0.4 95.2 0.9 87.1 1.7 81.3 3.0
"Thyroid" 96.0 0.8 95.8 1.2 94.0 1.6 95.4 2.1
"Testis" 94.7 1.1 94.0 1.3 91.1 1.8 88.2 2.3
[...]

(which would be very straightforward in R) and then use say 'awk'
to compute 'groff' data with embedded tags (see below).
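
As a rough illustration of the first step only (writing the data file),
here is a minimal sketch in Python rather than R. The function name
write_data_file and the in-memory rows are assumptions made for the
example; in R itself, write.table() with its default quoting and space
separator should give much the same layout.

```python
import csv
import io

def write_data_file(header, rows, out):
    """Write quoted labels and bare numbers, separated by single spaces."""
    writer = csv.writer(out, delimiter=" ",
                        quoting=csv.QUOTE_NONNUMERIC,  # quote strings only
                        lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)

# Values copied from the example lines above (survival, SE, survival, SE, ...).
header = ["% survival / standard error", "5 year", "10 year", "15 year", "20 year"]
rows = [
    ["Prostate", 98.8, 0.4, 95.2, 0.9, 87.1, 1.7, 81.3, 3.0],
    ["Thyroid", 96.0, 0.8, 95.8, 1.2, 94.0, 1.6, 95.4, 2.1],
]

buf = io.StringIO()
write_data_file(header, rows, buf)
print(buf.getvalue(), end="")
```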

The file which I would then submit to 'groff' would look like



.ds RED "\X'ps: exec 1 0 0 setrgbcolor'
.ds GREY "\X'ps: exec 0.5 0.5 0.5 setrgbcolor'
.ds BLACK "\X'ps: exec 0 0 0 setrgbcolor'
.ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
\*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
'\
\Z'\
\h'\\$1p'\
\*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
'\h'0.5i'\
\v'-0.2m'\*[BLACK]
.LP
.TS
box tab(#);
c3 s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s s1 s1w(0.5i) s.

\f[BMB]\s[15]Estimated survival rates by cancer site\s0\fP

.T&
l c s s s s s s s s s s s.
#\fB\s[12]% survival / standard error\s0\fP
#\_
.T&
l c s s c s s c s s c s s.
#5 year#10 year#15 year#20 year
#\_#\_#\_#\_
.T&
l  n l n n c n n c n n c n.
Prostate#98.8#\*[bx 35.6]#0.4#95.2#\*[bx 34.3]#0.9#87.1#\
\*[bx 31.4]#1.7#81.3#\*[bx 29.3]#3.0
Thyroid#96.0#\*[bx 34.6]#0.8#95.8#\*[bx 34.5]#1.2#94.0#\
\*[bx 33.8]#1.6#95.4#\*[bx 34.3]#2.1
Testis#94.7#\*[bx 34.1]#1.1#94.0#\*[bx 33.8]#1.3#91.1#\
\*[bx 32.8]#1.8#88.2#\*[bx 31.8]#2.3
[...]
Pancreas#4.0#\*[bx 1.4]#0.5#3.0#\*[bx 1.1]#1.5#2.7#\
\*[bx 1.0]#0.6#2.7#\*[bx 1.0]#0.8

.TE



The key here is to define a "parametrised string" which will
be invoked as "\*[bx <number>]". This is the main "embedded tag".

Each box is 0.5 inch wide (36 points), and consists of a lefthand
section in Red whose width is 36*percent/100 points, with a
righthand section in Grey whose width is 36*(1 - percent/100) points.
The height of the box is 1 em (which, in points, is the point-size
of the current font), and the box has been shifted downwards slightly
(0.2 em) to align it nicely with the text. The parameter "<number>"
in "\*[bx <number>]" is the value of 36*percent/100. So this can, for
instance, be easily computed in an 'awk' run.
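
The arithmetic can be checked with a few lines of code. This is only a
sketch, written in Python instead of 'awk'; the names BOX_WIDTH_PT and
box_widths are made up for the illustration.

```python
BOX_WIDTH_PT = 36.0  # 0.5 inch at 72 points per inch

def box_widths(percent):
    """Return (red, grey) widths in points for a survival percentage."""
    red = BOX_WIDTH_PT * percent / 100.0
    grey = BOX_WIDTH_PT - red          # same as 36*(1 - percent/100)
    return red, grey

for pct in (98.8, 87.1, 4.0):
    red, grey = box_widths(pct)
    # One decimal place, as in the table rows above.
    print("%5.1f%%  ->  \\*[bx %.1f]   (grey: %.1f pt)" % (pct, red, grey))
```

For 98.8 percent this gives a Red width of 35.6 points, which is the
value used in the Prostate row.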

The block of "code"

.ds bx \x'-0.2m'\x'-0.2m'\v'0.2m'\Z'\
\*[RED]\D'P \\$1p 0 0 -1m -\\$1p 0 0 1m'\
'\
\Z'\
\h'\\$1p'\
\*[GREY]\D'P 0.5i-\\$1p 0 0 -1m \\$1p-0.5i 0 0 1m'\
'\h'0.5i'\
\v'-0.2m'\*[BLACK]

defines the tag "\*[bx ...]", which is responsible for drawing the
graphical item in the table wherever it is invoked. Initially it
is padded above and below with a bit of extra space ("\x...") and
moved down slightly ("\v'0.2m'"), then colour changes to Red and
a filled Red polygon is drawn; then the drawing point is shifted
and a filled Grey polygon is drawn. Finally the colour is changed
back to Black for the text part of the Table. The value of "<number>"
is substituted for "\\$1" wherever this occurs in the definition
of "bx".
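
To see what the two polygons amount to, the relative moves in each
"\D'P ...'" can be accumulated into absolute corner points. The sketch
below does this in Python. It assumes a 10-point font (so 1 em = 10
points) and uses hypothetical function names; it does not model the
"\x", "\v" or colour parts of the definition. Note that groff's vertical
axis points downwards, so a move of -1m goes up.

```python
def polygon_vertices(start, moves):
    """Turn a start point and relative (dx, dy) moves into absolute points."""
    x, y = start
    points = [(x, y)]
    for dx, dy in moves:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

def bx_polygons(w, em=10.0, box=36.0):
    """Corner points (in points) of the Red and Grey parts for argument w.

    em=10.0 is an assumed font size; box=36.0 is the 0.5i box width.
    """
    # Red: starts at the current position, width w.
    red = polygon_vertices((0.0, 0.0),
                           [(w, 0.0), (0.0, -em), (-w, 0.0), (0.0, em)])
    # Grey: starts after the horizontal shift of w, width box - w.
    grey = polygon_vertices((w, 0.0),
                            [(box - w, 0.0), (0.0, -em), (w - box, 0.0), (0.0, em)])
    return red, grey

red, grey = bx_polygons(18.0)
print("red: ", red)
print("grey:", grey)
```

In both cases the four moves trace a closed rectangle that ends where
it began, and the Grey rectangle starts exactly where the Red one stops.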

The line ".TS" leads in to a Table definition, which ends with ".TE".
The next few lines specify table layout (types, spacings and
widths of columns, cell separator "#", etc.); and then come the
data for each line of the table, in which the box tag "\*[bx ...]"
occurs where needed. As indicated above, the full table data could
probably be easily computed in R and can certainly be easily done
in 'awk' or 'perl'.
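
As an indication of how little is involved, here is a sketch of that
conversion in Python (standing in for 'awk' or 'perl'). It turns one
line of the data file shown earlier into one table row with the box
tags inserted. The function name groff_row is hypothetical, and the row
is emitted on a single line, whereas the rows above are broken with a
trailing backslash.

```python
import shlex

def groff_row(line, box_pt=36.0):
    """Convert '"Label" surv se surv se ...' into a '#'-separated tbl row."""
    fields = shlex.split(line)            # handles the double-quoted label
    label, values = fields[0], fields[1:]
    cells = [label]
    # Values alternate: survival, standard error, survival, ...
    for surv, se in zip(values[0::2], values[1::2]):
        width = box_pt * float(surv) / 100.0
        # Keep the numbers as the original text so "3.0" stays "3.0".
        cells += [surv, "\\*[bx %.1f]" % width, se]
    return "#".join(cells)

print(groff_row('"Prostate" 98.8 0.4 95.2 0.9 87.1 1.7 81.3 3.0'))
```

The layout lines between ".TS" and the data would still be written by
hand, or simply prepended as fixed text.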

After all that, the result is quite pleasing -- and, when I compare
it with the graph shown on Sam's URL, it seems to me to represent
the numbers much more accurately, as well as being visually slightly
more expressive.

It would also be quite feasible to "complicate" the graphics with
indications of SE etc., by adding more to the definition of \*[bx ...].

I have looked at the "LaTeX file produced by lstex.describe" for
Frank Harrell's example. Granting that it has no doubt been automatically
produced, it is enormous and, for practical purposes, uneditable if
you want to tweak features of the display. It would be interesting
to see what had to be done further back up the line to produce it;
this might be, of course, much easier to tweak. On the other hand,
my 'groff source' file above is compact and easily changed.

If anyone would like to look at the output I have produced by the
above method (PDF file), and the full groff source file, drop me a
line (I'll send them privately to Sam anyway).

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 01-Sep-06                                       Time: 15:56:46
------------------------------ XFMail ------------------------------


