[R] page boundaries for latex printing of summary.formula objects in Hmisc

Erik Iverson eriki at ccbr.umn.edu
Mon Mar 8 23:10:39 CET 2010


Warning, I'm guessing only those who have used the Hmisc package's 
summary.formula function with LaTeX will be able to offer much help here.

I am using the Hmisc package's summary.formula function to produce 
tables for a LaTeX report.  The "latex" function in the same package 
supports longtables in LaTeX.  Ideally, I would like for page breaks in 
the LaTeX output to only occur at variable boundaries.

## Sample R code

## load the Hmisc package
library(Hmisc)

## create an example data.frame
test.df <- data.frame(sex = gl(2, 110, labels = c("Male", "Female")),
         fac1 = sample(gl(11, 100, labels = paste("V1 Level", 1:11))),
         fac2 = sample(gl(11, 100, labels = paste("V2 Level", 1:11))),
         fac3 = sample(gl(11, 100, labels = paste("V3 Level", 1:11))),
         fac4 = sample(gl(11, 100, labels = paste("V4 Level", 1:11))),
         fac5 = sample(gl(11, 100, labels = paste("V5 Level", 1:11))),
         fac6 = sample(gl(11, 100, labels = paste("V6 Level", 1:11))))

## create the summary.formula object
sf <- summary.formula(sex ~ fac1 + fac2 + fac3 + fac4 + fac5 + fac6,
                       data = test.df,
                       method = "reverse")

## print out the LaTeX code to the screen, not a file
latex(sf, file = "", longtable = TRUE)

Notice how the LaTeX output puts the page break in the middle of factor 4 
(fac4), instead of before or after it.

<excerpt of LaTeX output follows>


I expect this, given how ?latex documents the lines.page argument, which 
is set to 40 by default:

Applies if ‘longtable=TRUE’. No more than ‘lines.page’
           lines in the body of a table will be placed on a single page.
           Page breaks will only occur at ‘rgroup’ boundaries.

The problem is that variable boundaries don't in general correspond to 
constants, like 40 lines.
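
To make the mismatch concrete, here is a small arithmetic sketch.  It 
assumes each 11-level factor takes up 12 table rows (one header row plus 
one row per level); that row count is my assumption about the table 
layout, not something taken from the Hmisc documentation.

```r
## assumed rows per variable: 1 header row + 11 level rows (an assumption)
rows.per.var <- rep(12, 6)
names(rows.per.var) <- paste0("fac", 1:6)

## last table row occupied by each variable
last.row <- cumsum(rows.per.var)

## which variable contains row 40, the default lines.page limit?
lines.page <- 40
hit <- names(last.row)[which(last.row >= lines.page)[1]]
hit
```

Under that assumption the variables end at rows 12, 24, 36, 48, ..., so 
row 40 lands inside fac4 rather than on a boundary.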

So, rgroup sounds promising.  I want the n.rgroup values to be the 
number of lines per variable, but since my tables are dynamic, in that 
the variables and the number of levels in them change over time, I 
can't think of a way to define n.rgroup without specifying it per 
variable.  My first thought was to compute it from the number of levels 
per variable in the formula.
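
A minimal sketch of that idea, using base R only.  The helper name 
rows.per.variable is hypothetical, and so is the counting rule: it 
assumes a factor takes one header row plus one row per level, and that 
any other variable takes a single row.  The second assumption in 
particular may not match how continuous variables are actually laid out, 
so the counts should be checked against the real table.

```r
## hypothetical helper: rows each variable is ASSUMED to occupy in the table
rows.per.variable <- function(data, vars) {
  vapply(vars, function(v) {
    x <- data[[v]]
    ## assumed layout: header + levels for factors, one row otherwise
    if (is.factor(x)) nlevels(x) + 1L else 1L
  }, integer(1))
}

## small stand-in data set (not the one from the example above)
d <- data.frame(age = c(31, 45, 52, 28),
                grp = factor(c("a", "b", "c", "a")),
                trt = factor(c("x", "y", "x", "y")))

## take the variable names from the right-hand side of the formula
f <- sex ~ age + grp + trt
n.rows <- rows.per.variable(d, all.vars(f[[3]]))
n.rows
```

Because the counts come from the data and the formula, they would follow 
along when the variables or their levels change.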

In fact, I did try this but immediately ran into some misconceptions I 
had about how continuous variables are represented internally within the 
latex function.

Is there any easier way to accomplish this breaking of pages on variable 
boundaries using this set of functions?  I suspect not, but thought I'd 
ask.  I think I can figure out the approach I suggested in the preceding 
2 paragraphs, but just want to make sure I'm not missing something ...
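
For reference, the page-breaking rule I have in mind can be sketched 
independently of Hmisc.  Given a vector of rows per variable, the 
hypothetical helper below marks each variable that would need to start a 
new page so that no variable is split.  The row counts used here are 
assumed values, and nothing in this sketch calls latex() or changes its 
output.

```r
## hypothetical helper: TRUE for each variable that would start a new page
starts.new.page <- function(rows, lines.page = 40) {
  used <- 0
  brk <- logical(length(rows))
  for (i in seq_along(rows)) {
    if (used > 0 && used + rows[i] > lines.page) {
      brk[i] <- TRUE   # variable does not fit; move it to a fresh page
      used <- 0
    }
    used <- used + rows[i]
  }
  setNames(brk, names(rows))
}

## assumed row counts: 12 rows for each of six variables
rows <- c(fac1 = 12, fac2 = 12, fac3 = 12, fac4 = 12, fac5 = 12, fac6 = 12)
brk <- starts.new.page(rows)
names(brk)[brk]
```

A variable with more rows than lines.page is still placed on a page of 
its own here and would overflow; the sketch does not try to split it.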

Thanks a lot!
Erik Iverson
