[R] R and Data Storage

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sat Oct 1 14:35:00 CEST 2005


rab45+ at pitt.edu wrote:
> Where I work a lot of people end up using Excel spreadsheets for storing
> data. This has limitations and maybe some less than obvious problems. I'd
> like to recommend a uniform way for storing and archiving data collected
> in the department. Most of the data could be stored in simple csv type
> files but it would be nice to have something that stores more information
> about the variables and units. netcdf seems like overkill (and not easy
> for casual users). Same for postgres and mysql databases. Could someone
> recommend some system for storing relatively small data sets (50-100
> variables, <1000 records) that would be reliable, safe, and easy for
> people to view and edit their data that works nicely with R and is open
> source? Am I asking for the moon?
> 
> Rick  B.

I use the facilities in the Hmisc package, which handles
variable labels and units of measurement.  It has functions for importing
data (saving labels in the appropriate place) and for making use of these
attributes (e.g., combining labels and units in an axis label, with a
smaller font for the units portion).  When such an annotated data frame is
saved using save(..., compress=TRUE), load()ing it back quickly restores
the annotated data frame.  The contents() function shows the
attributes, and we use html(contents()) to put up a web page with
hyperlinks to the value labels (the levels attribute of factor variables).
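
As a rough illustration of the workflow described above, here is a minimal
sketch.  It assumes the Hmisc package is installed; the data frame d, its
variables, and the temporary file are all made up for the example and are
not from any real data set.

```r
library(Hmisc)  # assumes Hmisc is installed

# Hypothetical example data; names and values are illustrative only
d <- data.frame(
  age = c(34, 51, 47),
  sbp = c(118, 135, 127),
  sex = factor(c("female", "male", "female"))
)

# Attach variable labels and units of measurement as attributes
label(d$age) <- "Age"
units(d$age) <- "years"
label(d$sbp) <- "Systolic blood pressure"
units(d$sbp) <- "mmHg"

# Save the annotated data frame to a compressed R data file
f <- tempfile(fileext = ".rda")
save(d, file = f, compress = TRUE)

# Load it back into a separate environment to check the round trip
e <- new.env()
load(f, envir = e)
restored <- e$d

label(restored$sbp)   # label survives save()/load()
units(restored$sbp)   # so do the units
contents(restored)    # summary of variables with labels, units, levels
```

Loading into a separate environment is only there to show that the labels
and units come back from the file rather than from the original object; in
everyday use a plain load(f) would do.  Passing the result of contents() to
html() would give the web page version mentioned above.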

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



