[R] R and Data Storage
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Sat Oct 1 14:35:00 CEST 2005
rab45+ at pitt.edu wrote:
> Where I work a lot of people end up using Excel spreadsheets for storing
> data. This has limitations and maybe some less than obvious problems. I'd
> like to recommend a uniform way for storing and archiving data collected
> in the department. Most of the data could be stored in simple csv type
> files but it would be nice to have something that stores more information
> about the variables and units. netcdf seems like overkill (and not easy
> for casual users). Same for postgres and mysql databases. Could someone
> recommend some system for storing relatively small data sets (50-100
> variables, <1000 records) that would be reliable, safe, and easy for
> people to view and edit their data that works nicely with R and is open
> source? Am I asking for the moon?
>
> Rick B.
I use the facilities in the Hmisc package, which handles variable
labels and units of measurement. It has functions for importing data
(saving labels in the appropriate place) and for making use of these
attributes (e.g., combining labels and units in an axis label, with a
smaller font for the units portion). When such an annotated data frame
is saved using save(...., compress=TRUE), load()ing it back quickly
restores the annotated data frame. The contents( ) function shows the
attributes, and we use html(contents( )) to put up a web page with
hyperlinks to the value labels (the levels attribute of factor
variables).
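
A minimal sketch of that workflow, assuming Hmisc is installed. The
data frame d, its variable names and values, and the file name d.rda
are made up for illustration; upData( ) is only one of several ways in
Hmisc to attach labels and units.

```r
library(Hmisc)

# Hypothetical example data; names and values are invented
d <- data.frame(age = c(34, 51, 47),
                sbp = c(118, 135, 127),
                sex = factor(c("female", "male", "female")))

# Attach variable labels and units of measurement
d <- upData(d,
            labels = c(age = "Age",
                       sbp = "Systolic Blood Pressure",
                       sex = "Sex"),
            units  = c(age = "years", sbp = "mmHg"))

# Save the annotated data frame in compressed form, then reload it
f <- file.path(tempdir(), "d.rda")   # hypothetical file location
save(d, file = f, compress = TRUE)
rm(d)
load(f)

# The label and units attributes come back with the data
label(d$sbp)
units(d$sbp)

# Summarize variables, labels, units, and factor levels
contents(d)
```

Passing the result of contents(d) to html( ) should give the web page
version mentioned above.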
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University