Programming with R for
Reproducible Research
Course Synopsis
See the Overview page. Eventually, more details at: Course Catalogue Data
Prerequisites
- Both parts of "Using R for Data Analysis" (lecture in the fall semester), or similar knowledge of R at an *intermediate* ("> beginner") level or above
- Laptop with R (>= 3.0.1) and one of RStudio, StatET, ESS, or a similar "R IDE" installed. Two students may team up and share one computer.
- One semester of (introduction to) statistics
Start of lectures
Tuesday, 18.02.2014
Lecture material
- Week 1:
- Organization, Topics, etc: Emacs org (source), pdf.
R code used while browsing chapter 7 of "Using R..": 2014-02-18-ex.R.
R markdown ("Rmd") file first.Rmd, conveniently opened in RStudio, demonstrating R markdown and its HTML (i.e. web content) output. The result (first.Rmd --> first.md --> first.html) is currently available as a public web page on RPubs.
- Week 2:
- Questions about "Using R"...
- && vs & and || vs | -- ?Logic
- coercion: ?c ?Extract
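The two questions above (?Logic and coercion via ?c and ?Extract) can be illustrated with a small sketch. It is not taken from the lecture files; all object names are made up for illustration.

```r
x <- c(TRUE, FALSE, NA)
y <- c(TRUE, TRUE, FALSE)

# & and | are vectorized: one result per element, NA handled element-wise
elementwise_and <- x & y   # TRUE FALSE FALSE
elementwise_or  <- x | y   # TRUE TRUE NA

# && and || work on single values and short-circuit:
# the right-hand side is only evaluated when needed
v <- NULL
safe <- !is.null(v) && v[1] > 0   # FALSE; 'v[1] > 0' is never evaluated

# c() coerces to the most general type: logical < integer < double < character
mixed_num <- c(1L, 2.5, TRUE)     # double
mixed_chr <- c(1, "a", TRUE)      # character: "1" "a" "TRUE"

# ?Extract: [ keeps the container, [[ extracts the element
lst <- list(a = 1, b = "z")
sub_list <- lst["a"]     # a list of length 1
element  <- lst[["a"]]   # the number 1
```

As a rule of thumb, use && and || inside if() conditions, and & and | when working with whole vectors.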
- More on functions, notably closures:
The two (or three) parts of a function: Commented R code.
- Using the R code from Matloff's book.
Our edition of the original Ch7/envexample1.R. Using RStudio's "compile as notebook" (*.R -> *.Rmd -> *.html, the latter with knitr) gives envexample1+ splinefun.
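As a minimal sketch of the parts of a function and of closures (this is not the lecture's commented R code; add_n and make_counter are hypothetical names): the arguments and the body are the two visible parts, and the enclosing environment is the third.

```r
add_n <- function(x, n = 1) x + n

arg_names <- names(formals(add_n))   # the argument list
fun_body  <- body(add_n)             # the body, an unevaluated expression
fun_env   <- environment(add_n)      # the environment the function was created in

# A closure: the returned function keeps access to 'count',
# which lives in the environment of the make_counter() call.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # modifies 'count' in the enclosing environment
    count
  }
}

c1 <- make_counter()
c2 <- make_counter()   # an independent counter with its own 'count'
r1 <- c1()   # 1
r2 <- c1()   # 2
r3 <- c2()   # 1
```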
- Week 3:
- Our edition of the original Ch7/bookvec.R
- Our example "text corpus" text1.txt
- Our version of the original Ch4/findwords.R; and the (more efficient!) split() version of the original Ch6/findwords.R. Note seq_along(.)
- Our modified excerpt of H. Wickham's functional programming chapter.
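The idea behind the two findwords variants can be sketched as follows. This is an illustration under assumptions, not the code from Matloff's book or our edition of it: the text is a small inline vector instead of text1.txt, and the function names are hypothetical.

```r
txt <- c("the quick brown fox", "jumps over the lazy dog", "the end")
words <- unlist(strsplit(txt, " ", fixed = TRUE))

# Loop version: grow a list of positions, one entry per distinct word
find_words_loop <- function(words) {
  wl <- list()
  for (i in seq_along(words)) {
    w <- words[i]
    wl[[w]] <- c(wl[[w]], i)   # wl[[w]] is NULL on first occurrence
  }
  wl
}

# split() version: group the positions by word in a single call
find_words_split <- function(words) split(seq_along(words), words)

res_loop  <- find_words_loop(words)
res_split <- find_words_split(words)

# seq_along() is safe for empty input, 1:length() is not
empty <- character(0)
n_safe   <- length(seq_along(empty))    # 0 iterations
n_unsafe <- length(1:length(empty))     # 2 iterations: 1, 0
```

Both versions return the same positions per word; only the order of the list entries differs (order of first appearance versus sorted names).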
- Week 4:
- [continuing "functional programming" (week 3, above)]
- The initial R session, somewhat extended, of How R Searches and Finds Stuff
- Functions -> environments:
- ls(), get(), assign(), find(), ls.str(), new.env(), parent.env(),
globalenv(), emptyenv(), and the first two figures in How R Searches and Finds Stuff
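A short sketch of the environment functions listed above, assuming nothing beyond base R; the objects e, f and env_chain_length are made up for illustration.

```r
e <- new.env()                 # a fresh environment
assign("x", 42, envir = e)     # create 'x' inside e
x_val   <- get("x", envir = e) # look it up again
e_names <- ls(e)               # names defined in e

# A function call creates a new environment whose parent is the
# environment in which the function was defined.
f <- function() {
  y <- 1
  environment()                # return the call's own environment
}
fe <- f()
same_parent <- identical(parent.env(fe), environment(f))

# Every chain of parents ends in the empty environment.
env_chain_length <- function(env) {
  n <- 0
  while (!identical(env, emptyenv())) {
    n <- n + 1
    env <- parent.env(env)
  }
  n
}
depth_global <- env_chain_length(globalenv())
```

The value of depth_global depends on which packages are attached, which is exactly what search() shows.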
- Week 5:
- one_counter() example in "functional programming"
- Reproducible research in action: Frank Harrell's new 'greport' package. Install it, and see Greport for the LaTeX setup
- "R is slow" etc:
-- "Premature optimization is the root of all evil", Donald Knuth
Rather: test, test, and test again, using all.equal(target, current, tolerance = ...), whose default tolerance is about 1.5e-8. Good R packages do this in their './tests/' subdirectory.
-- A typical issue with a for() loop, from Stack Overflow
--> functions system.time() and proc.time()
-- Current issue on 'matrix vs. data.frame' on the R-help mailing list.
-- Hadley's chapter "Performance" (updated, March 25)
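The "test first, then time" approach above can be sketched like this. The two functions are hypothetical examples, not the Stack Overflow or R-help code, and no timings are claimed since they depend on the machine.

```r
# Two implementations of the same computation
sum_sq_loop <- function(x) {
  s <- 0
  for (xi in x) s <- s + xi^2
  s
}
sum_sq_vec <- function(x) sum(x^2)

set.seed(1)
x <- runif(1e5)

# 1. Test: results agree up to numerical tolerance.
#    all.equal() returns TRUE or a description of the difference,
#    so wrap it in isTRUE() when a logical is needed.
agree <- isTRUE(all.equal(sum_sq_loop(x), sum_sq_vec(x)))

# Exact comparison of floating point numbers is fragile:
exact_eq <- (0.1 + 0.2 == 0.3)                 # FALSE
close_eq <- isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE

# 2. Only then measure: repeat the call to get a measurable duration
t_loop <- system.time(for (i in 1:20) sum_sq_loop(x))
t_vec  <- system.time(for (i in 1:20) sum_sq_vec(x))
```

system.time() is a thin wrapper that takes the difference of two proc.time() calls around the expression.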
- Week 6:
- Continuing "Performance" (see week 5): "Measure, don't guess" --> Using Rprof() and microbenchmark
- The 'matrix vs. data.frame' R-help example (see above). Bill Dunlap's solution + more, as Rmd script
- R's byte compiler (-> require("compiler"); ?cmpfun), see the above *.Rmd
- Start looking at R packages, source and "binary", see Notes below (week 7).
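A minimal sketch of the byte compiler, independent of the *.Rmd mentioned above; the function f is a made-up example. Note that in R versions from 3.4.0 on, the just-in-time compiler is enabled by default, so the difference between f and fc may be small there.

```r
library(compiler)

# A deliberately loop-heavy function
f <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i %% 7
  s
}

fc <- cmpfun(f)   # byte-compiled version of the same function

# Test before timing: both versions must give the same result
same_result <- isTRUE(all.equal(f(1e4), fc(1e4)))

# Measure, don't guess (timings depend on machine and R version)
t_plain    <- system.time(f(1e6))
t_compiled <- system.time(fc(1e6))
```

For finer-grained comparisons, the microbenchmark package (not part of base R) repeats each expression many times and reports the distribution of timings.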
- Week 7:
- R packages, in source and "binary"; browsing Notes of "Package writing" course
- package.skeleton()
- Understand more of How R Searches and Finds Stuff.
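As a small sketch of package.skeleton(), assuming a throw-away function add_one and a package name demoPkg (both hypothetical), the following creates a source package skeleton in a temporary directory:

```r
# Put the objects for the package into their own environment
pkg_env <- new.env()
assign("add_one", function(x) x + 1, envir = pkg_env)

# Create the skeleton in a temporary directory, not the working directory
out_dir <- tempfile("pkgdemo")
dir.create(out_dir)
package.skeleton(name = "demoPkg", list = "add_one",
                 environment = pkg_env, path = out_dir)

# Inspect what was generated (DESCRIPTION, NAMESPACE, R/, man/, ...)
pkg_files <- list.files(file.path(out_dir, "demoPkg"), recursive = TRUE)
```

The generated DESCRIPTION and man/*.Rd files are templates that must be edited by hand before the package can be checked and built.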
- Extras:
- A small script to get all methods of a generic function, hidden or not, nicely in a list.
- From lapply() to parLapply(): R's built-in package 'parallel' (require(parallel))
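The step from lapply() to parLapply() can be sketched as follows. The worker function slow_square and the choice of two workers are assumptions for illustration.

```r
library(parallel)

# A deliberately slow function, standing in for real work
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

# Serial version
res_serial <- lapply(1:8, slow_square)

# Parallel version: start two worker processes, distribute the work,
# and always stop the cluster afterwards, even if an error occurs
cl <- makeCluster(2)
res_par <- tryCatch(
  parLapply(cl, 1:8, slow_square),
  finally = stopCluster(cl)
)

# Same test-first habit as before: results must agree
same <- identical(res_serial, res_par)
```

Workers are separate R processes, so any other objects or packages the function needs must be sent to them first, e.g. with clusterExport() or clusterEvalQ().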
Lecture attestation (Testat):
In order to obtain the ECTS credit you have to pass the exam (answering some questions and writing R code in a *.Rmd, i.e. R Markdown, file) at the end of the teaching block, specifically on April 15.
Recommended Reading
Norman Matloff (2011) The Art of R Programming - A tour of statistical software design.
No Starch Press, San Francisco. In stock at Polybuchhandlung (CHF 42.-); see online for data and R code.
Hadley Wickham (2013 ff) Advanced R, online
more advanced than our course; partly focused on his own packages
Suraj Gupta (March 29, 2012) How R Searches and Finds Stuff, online;
Tough read, but helpful with its nice illustrations. Do consider Duncan Murdoch's note about it with minor caveats.
Miscellaneous on Programming (with R)
- "Literate Programming" by Donald Knuth
- "The Elements of Programming Style" by Kernighan and Plauger: Wikipedia, Quotes