[R] Documenting data

Christopher W Ryan cryan at binghamton.edu
Thu Jun 30 17:39:24 CEST 2016


Pito--

You describe excellent practices.

The R code itself, saved as a script, provides some documentation of how you got from original data to wherever you are.

Use # comments liberally. 
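
As a rough sketch of those two points (the data frame and column names below are made up for illustration, not taken from your data), a commented script that derives a new column might look like this. The comment() call is base R and is one optional way to attach a description to the derived column itself:

```r
# Hypothetical example data; names and values are invented for illustration.
patients <- data.frame(
  id        = 1:3,
  weight_kg = c(70, 82, 55),
  height_cm = c(175, 180, 160)
)

# Derived column: body mass index = weight (kg) / height (m)^2.
# Height is recorded in cm, so convert to metres first.
patients$bmi <- patients$weight_kg / (patients$height_cm / 100)^2

# Attach a free-text note to the derived column so the description
# travels with the object. The note is not shown when printing.
comment(patients$bmi) <- "BMI = weight_kg / (height_cm / 100)^2"

# Retrieve the note later.
comment(patients$bmi)
```

Attributes like this can be lost when you subset or merge, so I would treat the script and its # comments as the primary record.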

Whenever possible, keep your raw data exactly as it was when you got it. Avoid changing the file itself; make all the changes on the objects in R.
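
A minimal sketch of that workflow, assuming a CSV file; the file names and columns are hypothetical, and the first two lines only create a stand-in for a raw file you would already have:

```r
# Stand-in for a raw file as received (hypothetical name and contents).
raw_path <- file.path(tempdir(), "survey_raw.csv")
write.csv(data.frame(id = 1:4, age = c(34, NA, 51, 29)),
          raw_path, row.names = FALSE)

# Read the raw file once; never write back to raw_path.
raw <- read.csv(raw_path)

# All cleaning happens on a separate object; 'raw' stays as a reference.
clean <- raw[!is.na(raw$age), ]                       # drop rows with missing age
clean$age_group <- ifelse(clean$age >= 40, "40+", "under 40")

# Save derived data under a different name, leaving the raw file untouched.
clean_path <- file.path(tempdir(), "survey_clean.csv")
write.csv(clean, clean_path, row.names = FALSE)
```

Rerunning the script from the top then rebuilds the cleaned data from the original file every time.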

Have you looked into various "reproducible research" systems for R, like Sweave or knitr?  They allow you to include analysis code and text of a manuscript or report all together in one file.
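
To give a flavor of it, here is a small sketch of an ordinary R script written so that knitr's spin() function can turn it into a report. The values are made up. Lines beginning with #' are treated as report text by spin(); to plain R they are just comments, so the script also runs on its own:

```r
#' # Age summary
#'
#' Lines starting with #' become prose in the report.
#' Ordinary R code below is run and its output is included.

ages <- c(34, 51, 29, 46)   # made-up values for illustration

#' The mean and the range of the ages:
mean_age  <- mean(ages)
age_range <- range(ages)
mean_age
age_range
```

If this were saved as, say, analysis.R (a hypothetical name) and knitr were installed, knitr::spin("analysis.R") should produce a report with the text, code, and results together.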

Christopher W. Ryan



On Jun 30, 2016, at 11:30, Pito Salas <pitosalas at brandeis.edu> wrote:
>I am studying statistics and using R in doing it. I come from software
>development where we document everything we do.
>
>As I “massage” my data, adding columns to a frame, computing on other
>data, perhaps cleaning, I feel the need to document in detail what the
>meaning, or background, or calculations, or whatever of the data is.
>After all it is now derived from my raw data (which may have been well
>documented) but it is “new.” 
>
>Is this a real problem? Is there a “best practice” to address this?
>
>Thanks!
>
>Pito Salas
>Brandeis Computer Science
>Feldberg 131
