[R] Fwd: Documenting data sets with many variables

Gavin Simpson gavin.simpson at ucl.ac.uk
Tue Aug 16 17:26:35 CEST 2005


On Tue, 2005-08-16 at 17:11 +0200, Arne Henningsen wrote:
> On Tuesday 16 August 2005 14:49, Roger D. Peng wrote:
> > Have you tried using 'promptData()' on the data frame and then
> > just using the resulting documentation file?
> 
> Thank you, Roger, for reminding me of 'promptData()'. It really is a
> useful tool. However, in my particular case the aim is to shorten the
> documentation and make it easier to understand, rather than to reduce the
> effort of writing it.
> 
> Any further hints are welcome!
> 
> Thanks,
> Arne

Would it not be expedient then to ignore the \format{} section and just
provide the information on the variables in, say, the \description{}?

The example below is taken from package vegan and documents two data
frames with 44 and 14 columns. Admittedly, none of the variables in the
species dataset is explicitly and individually described in this
example, but I think that is sufficient in this case.

\name{varespec}
\alias{varechem}
\alias{varespec}
\docType{data}
\title{Vegetation and environment in lichen pastures}
\usage{
       data(varechem)
       data(varespec)
}
\description{
  The \code{varespec} data frame has 24 rows and 44 columns.  Columns
  are estimated cover values of 44 species.  The variable names are
  formed from the scientific names, and are self explanatory for anybody
  familiar with the vegetation type.
  The \code{varechem} data frame has 24 rows and 14 columns, giving the
  soil characteristics of the very same sites as in the \code{varespec}
  data frame. The chemical measurements have obvious names.
  \code{Baresoil} gives the estimated cover of bare soil, \code{Humpdepth}
  the thickness of the humus layer.
}
....
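
Applied to the micEcon data set from the original question, the same
idea might look roughly like the sketch below. It is only an
illustration: it assumes that the data codoc check compares nothing but
the \item names inside \format{} with the column names, so the pattern
descriptions are moved to \details{}. The wording of \format{} and the
choice of \details{} are my own guesses, not taken from the package; the
count of 82 variables and the item texts come from Arne's messages, and
the elided parts are left elided.

```Rd
\format{
  A data frame with 82 variables. See the Details section for how
  the variable names are formed.
}
\details{
  The aggregate commodity groups are numbered 1 to 11
  (1 = food, 2 = clothing, ...). For each group X:
  \describe{
    \item{xAggX}{Expenditure on the aggregate commodity group X
      (in Millions of US-Dollars).}
    \item{pAggX}{Price index for the aggregate commodity group X
      (1972 = 100).}
    \item{wAggX}{Expenditure share of the aggregate commodity group X.}
  }
}
```

Whether this passes cleanly is worth confirming by running "R CMD check"
on the package, since the check may behave differently between R
versions.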

HTH

G

> 
> > -roger
> >
> > Arne Henningsen wrote:
> > > Hi,
> > >
> > > Since nobody answered my first message, I will try to explain my
> > > problem more clearly and more generally this time:
> > >
> > > I have a data set in my R package "micEcon" that has many variables
> > > (82). Therefore, I would like to avoid describing all variables in the
> > > "\format" section of the documentation (.Rd file). However, doing so
> > > makes "R CMD check" complain about "data codoc mismatches" (see the
> > > details below). Is there a way to avoid describing all variables
> > > without getting a complaint from "R CMD check"?
> > >
> > > Thanks,
> > > Arne
> > >
> > >
> > > ----------  Forwarded Message  ----------
> > >
> > > Subject: Documenting data sets with many variables
> > > Date: Friday 05 August 2005 14:03
> > > From: Arne Henningsen <ahenningsen at email.uni-kiel.de>
> > > To: R-help at stat.math.ethz.ch
> > >
> > > Hi,
> > >
> > > I extended the data set "Blanciforti86" that is included in my R package
> > > "micEcon". For instance, I added consumer prices, annual consumption
> > > expenditures and expenditure shares of eleven aggregate commodity groups.
> > > The corresponding variables in the data frame are called "pAgg1",
> > > "pAgg2", ..., "pAgg11", "xAgg1", "xAgg2", ..., "xAgg11", "wAgg1",
> > > "wAgg2", ..., "wAgg11". To avoid describing all 33 items in the
> > > "\format" section of the documentation (.Rd file), I wrote something like
> > >
> > > \format{
> > >    This data frame contains the following columns:
> > >    \describe{
> > >       [ . . . ]
> > >       \item{xAggX}{Expenditure on the aggregate commodity group X
> > >          (in Millions of US-Dollars).}
> > >       \item{pAggX}{Price index for the aggregate commodity group X
> > >          (1972 = 100).}
> > >       \item{wAggX}{Expenditure share of the aggregate commodity group X.}
> > >       [ . . . ]
> > >    }
> > > }
> > >
> > > and explained the 11 aggregate commodity groups only once in a different
> > > section (1=food, 2=clothing, ... ). However, "R CMD check" now complains
> > > about "data codoc mismatches", e.g.
> > >   Code: [...] pAgg1 pAgg2 pAgg3  [...]
> > >   Docs: [...] pAggX [...]
> > >
> > > Is there a way to avoid the description of all 33 items without getting a
> > > complaint from "R CMD check"?
> > >
> > > Thanks,
> > > Arne
> > >
> > > -------------------------------------------------------
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson                     [T] +44 (0)20 7679 5522
ENSIS Research Fellow             [F] +44 (0)20 7679 7565
ENSIS Ltd. & ECRC                 [E] gavin.simpsonATNOSPAMucl.ac.uk
UCL Department of Geography       [W] http://www.ucl.ac.uk/~ucfagls/cv/
26 Bedford Way                    [W] http://www.ucl.ac.uk/~ucfagls/
London.  WC1H 0AP.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%