ESS and Literate Statistical Practice

A.J. Rossini rossini at blindglobe.net
Thu Aug 16 16:05:10 CEST 2001


>>>>> "MN" == Matthew Nelson <mnelson at esperion.com> writes:

    MN> Thanks.  It is still early in Washington.  Matt

Yep, and I'm in to work, and it's still too early.

But basically...  I based it on noweb, since noweb is a good solid
(and semi-portable) literate programming tool.  You'll need noweb
for document generation (to dvi or html, from there to ps/pdf formats,
etc), but you can do most (all!?) of the analysis without it.

The current functionality lets you work out of a noweb buffer, and it
changes modes between LaTeX (AUC-TeX or similar, I think it depends on
what you've mapped "latex-mode" to) and the particular code mode you
want.  Since it sounds like you are using R exclusively, I believe
that there is an option for making that the default code mode.  It's
on one of the pull-down menus, I think.

ESS, with noweb, provides 2 additional "evaluate" commands:

        ess-eval-chunk
        ess-eval-thread

The first evaluates a single literate chunk of code (or scrap).  Since
noweb lets you string chunks together, we also have the notion of a
thread, and the second evaluates all the chunks in one.  It's been
months since I looked at the code (and I primarily use this via chunk
evaluation), so I am not sure of the exact logic that is used for
picking the chunks in a thread, but I suspect it picks up everything
of the form:
        <<*>>=
        # stuff
        @
and runs them all together.
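
To make that guess concrete, here is a minimal Python sketch of
"collect every chunk with the same name and run them together".  It is
an illustration only, not the ESS Emacs Lisp code: collect_chunks is a
hypothetical helper, and it assumes the chunk markers start in the
first column, as they would in a real .nw file (the examples in this
email are indented only for display).

```python
import re
from collections import defaultdict

# A chunk definition line looks like "<<name>>=" on a line of its own.
CHUNK_START = re.compile(r"^<<(.+?)>>=\s*$")


def collect_chunks(source):
    """Map each chunk name to the concatenated bodies of all its definitions."""
    chunks = defaultdict(list)
    current = None  # name of the code chunk being read; None while in documentation
    for line in source.splitlines():
        start = CHUNK_START.match(line)
        if start:
            current = start.group(1)
        elif line == "@" or line.startswith("@ "):
            current = None  # "@" (optionally followed by text) ends the code chunk
        elif current is not None:
            chunks[current].append(line)
    return {name: "\n".join(body) for name, body in chunks.items()}


# Hypothetical two-chunk document, both chunks named "*".
example = """\
Some documentation text.
<<*>>=
x <- 1
@
More documentation.
<<*>>=
y <- x + 1
@ %def y
"""

thread = collect_chunks(example)["*"]
print(thread)
```

Under this reading, the "thread" is just the text of all the <<*>>
chunks in document order, which could then be sent to the R process in
one go.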

So the practical issues:  
1. uncomment the literate programming code in ess-site.el 
2. start with a file with a "*.nw" suffix.
3. consider the rest of this email as an approximate guide.

problems: 
1. no current bindings for ess-eval-chunk or ess-eval-thread.  What do
   you think would work right (or even just reasonably)?
2. I would suggest LaTeX as the documentation language.


Generally, you need to specify the mode that you want at the beginning
of the chunk, i.e.:

        <<*>>=
        ## -*- mode: R -*-
        ## here's a better example of a chunk.  Define the mode in the
        ## first line 
        R.code <- things.to.do()
        more.R.code <- more.things.to.do()
        ## and for indexing, mention objects which are defined/created
        ## in this chunk.
        @ %def R.code more.R.code

which is annoying.  I need to automate it, sometime, in the
insert-chunk function.

"ess-eval-chunk" evaluates the current code chunk and does
substitution for all "called chunks".  "ess-eval-thread" does
something similar.  Both are reimplementations of the noweb evaluator
in Emacs Lisp (done by Mark Lunt) so that you don't need noweb.
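
For anyone unfamiliar with what "substitution for called chunks"
means, here is a rough Python sketch of the idea.  It is not Mark's
Emacs Lisp code, just an illustration under some assumptions: expand
is a hypothetical helper, the chunks are already available as a
name-to-body dictionary, and a reference is only recognized when
<<name>> sits on a line of its own.  The chunk names and "data.csv"
are made up.

```python
import re

# A chunk reference: "<<name>>" alone on a line, possibly indented.
# A definition line ends in ">>=" and so does not match.
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")


def expand(name, chunks, _active=()):
    """Return chunk `name` with each <<ref>> line replaced by that chunk's expansion."""
    if name in _active:
        raise ValueError(f"circular chunk reference: {name}")
    out = []
    for line in chunks[name].splitlines():
        ref = CHUNK_REF.match(line)
        if ref:
            indent, target = ref.groups()
            expanded = expand(target, chunks, _active + (name,))
            # Keep the indentation of the referencing line for every inserted line.
            out.extend(indent + sub for sub in expanded.splitlines())
        else:
            out.append(line)
    return "\n".join(out)


# Hypothetical chunks: the root chunk "*" calls the chunk "load data".
chunks = {
    "*": "<<load data>>\nfit <- lm(y ~ x, data = d)",
    "load data": 'd <- read.csv("data.csv")',
}
print(expand("*", chunks))
```

The expanded text is plain R code with no chunk markers left in it,
which is what ends up being evaluated.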

It has been claimed that you can document using HTML, and I think that
is configurable via Emacs Lisp (to use one of the HTML modes,
hm--html-mode or psgml's html sub-mode).  

On a related note, I've been very slowly working with XAE (XML
Authoring Environment (for Emacs)) integration, for doing DocBook-XML
based literate programming.  I've got a current tag set working, but
it's not up to snuff, yet.  XML is rather burdensome for literate
programming, compared to LaTeX!

LSP is difficult mentally.  The approach that works best is the
"perfect approach", i.e. doing all the steps in order for a
statistical consult or analysis.  You have to think about it; it turns
out to be rather inefficient if you barge ahead like a bull in a china
shop.  It's not really a "prototype and throw away" approach, which is
another programming methodology (read: Extreme Programming) that I'm
evaluating as a form of "statistical practice".
 
However, in my personal experience in statistical consulting, it does
result in much better work and results for data analysis (mostly from
the thought and discipline required), since after all, you've
front-loaded the difficult part rather than postponing it
(i.e. thought about the issues and specified the analysis plan before
writing).

If you have any questions or comments, let me know.

best,
-tony

-- 
A.J. Rossini				Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics		rossini at u.washington.edu	
FHCRC/SCHARP/HIV Vaccine Trials Net	rossini at scharp.org
-------- (wednesday/friday is unknown) --------
FHCRC: M-Tu : 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW:    Th   : 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ex1.nw
Type: application/octet-stream
Size: 742 bytes
Desc: Simple Silly Noweb/ESS example for playing with
URL: <https://stat.ethz.ch/pipermail/ess-help/attachments/20010816/8d8771d8/attachment.obj>