Programming with R for
Reproducible Research
Course Synopsis
See the Overview page. More details will eventually be available at: Course Catalogue Data.
Prerequisites
- Both parts of
"Using R for Data Analysis" (lecture in fall semester),
or similar knowledge of R on at least an *intermediate* (" > beginner") level
- Laptop with R (>= 3.0.1) and one of RStudio / StatET / ESS (or a similar "R IDE") installed. Two students may team up, using one computer.
- One semester of (introduction to) statistics
Start of lectures
Tuesday, 23.02.2016
Lecture material
- Week 1:
- - Organization, Topics, etc: Emacs org (source), pdf.
- R code during browsing of "Using R.." chapter 7: 2014-02-18-ex.R.
- R markdown ("Rmd") file first.Rmd, conveniently opened in RStudio, demonstrating R markdown and its HTML (i.e. web content) output. The result (first.Rmd --> first.md --> first.html) is currently available as a public web page on RPubs.
- "Everything (in R) is an object" --> explorations and a table [Rpubs] and its Rmd source
- Week 2:
- - Your questions about "Using R" (and the material above) : ? ...
- && vs & and || vs | -- ?Logic
- coercion: ?c ?Extract
- More on functions, notably closures:
The two (and three) parts of a function: Commented R code and R markdown.
- Excursion: Exploring R packages and functions in there: Sweave from "Using R" or just the "one" slide
- Using the R code from Matloff's book.
Our edition of original Ch7/envexample1.R. Using RStudio's "compile as notebook" (*.R -> *.Rmd -> *.html, the latter with knitr) gives envexample1+ splinefun.
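The Week 2 topics on `&&` vs `&`, `||` vs `|`, and coercion can be tried out with a few lines. This is a minimal sketch of our own (not one of the course files); all variable names are made up for illustration.

```r
x <- c(TRUE, FALSE, NA)
y <- c(TRUE, TRUE, FALSE)

# '&' and '|' are vectorized: one result per element (see ?Logic)
and_vec <- x & y   # TRUE FALSE FALSE
or_vec  <- x | y   # TRUE  TRUE    NA

# '&&' and '||' work on single values and short-circuit:
# the right-hand side is not evaluated when the left side decides the result
short_and <- FALSE && stop("never evaluated")
short_or  <- TRUE  || stop("never evaluated")

# c() coerces to the most general type involved (see ?c):
# logical < integer < double < character < list
types <- c(
  lgl_int = typeof(c(TRUE, 2L)),
  int_dbl = typeof(c(1L, 2.5)),
  dbl_chr = typeof(c(1, "a")),
  any_lst = typeof(c(1, list("a")))
)
print(types)

# Sub-assignment coerces too (see ?Extract)
v <- 1:3
v[2] <- "b"
print(v)
```

In `if ()` conditions, `&&` and `||` are usually what you want; `&` and `|` are for element-wise work on whole vectors.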
- Week 3:
- - Our edition of original Ch7/bookvec.R
- our example "text corpus" text1.txt
- Our version of original Ch4/findwords.R; and the (more efficient!)
split() version of original Ch6/findwords.R. Note seq_along(.)
- Our modified excerpt of H.Wickham's functional programming chapter.
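To illustrate why a split() version of findwords can replace an explicit loop, here is a small sketch. It is not Matloff's code nor our edited files: the text string and the function names `find_words_loop` and `find_words_split` are hypothetical stand-ins.

```r
# Hypothetical stand-in for a text corpus (not the course's text1.txt)
txt <- "the quick brown fox jumps over the lazy dog the end"
words <- strsplit(txt, " ", fixed = TRUE)[[1]]

# Loop version: for each word, append its position to that word's entry
find_words_loop <- function(words) {
  wl <- list()
  for (i in seq_along(words)) {
    w <- words[i]
    wl[[w]] <- c(wl[[w]], i)
  }
  wl
}

# split() version: group the positions 1..n by the word found at each position
find_words_split <- function(words) split(seq_along(words), words)

wl1 <- find_words_loop(words)
wl2 <- find_words_split(words)

# Same content; split() returns the words in sorted order
stopifnot(identical(wl1[names(wl2)], wl2))
print(wl2[["the"]])

# Why seq_along(x) rather than 1:length(x): it is safe for empty input
empty <- character(0)
length(seq_along(empty))   # 0 iterations
length(1:length(empty))    # 2 iterations (1 and 0), usually a bug
```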
- Week 4:
- - [continuing "functional programming" (week 3, above)]
- The initial R session, somewhat extended, of How R Searches and Finds Stuff
- Functions -> environments:
- ls(), get(), assign(), find(), ls.str(), new.env(), parent.env(),
globalenv(), emptyenv(), and the first two figures in How R Searches and Finds Stuff
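The environment functions listed for Week 4 can be explored on a small closure. The following is a sketch under our own assumptions: `make_counter` is a hypothetical name chosen for illustration and is not taken from the course scripts.

```r
# A function that returns a function; the inner function "remembers" i
make_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1   # modifies i in the enclosing environment
    i
  }
}

counter <- make_counter()
counter()
counter()

# The closure's enclosing environment holds its private state
e <- environment(counter)
ls(e)                   # "i"
get("i", envir = e)     # 2, after two calls
# Its parent is the environment in which make_counter() was defined
identical(parent.env(e), environment(make_counter))

# Creating and filling an environment by hand
env <- new.env(parent = emptyenv())
assign("a", 42, envir = env)
get("a", envir = env)
ls.str(env)

# find() reports where on the search path a name is found
find("mean")            # "package:base", unless masked
```

Following `parent.env()` repeatedly from `globalenv()` walks the search path until `emptyenv()` is reached.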
- Week 5:
- - one_counter() example in "functional programming" (above) --- please study as homework, ask in class.
- "R is slow" etc:
-- "Premature optimization is the root of all evil", Donald Knuth
Rather: test, test, and test again, using all.equal(target, current, tolerance ~= 10^-8). Good R packages do so in their './tests/' subdirectory.
-- a typical issue about a for() loop from Stack Overflow
--> functions system.time() and proc.time()
-- User Q about 'matrix vs. data.frame' on the R-help mailing list (March 2014). -- a first look at Bill Dunlap's solution + more, as Rmd script.
-- Our (modified) Rmd on "Performance" from Hadley Wickham's book chapter "Performance".
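As a small illustration of "test first, then time", the sketch below compares a hand-written loop with a built-in vectorized function. It is our own example, not the Stack Overflow or R-help case; `slow_cumsum` is a hypothetical name.

```r
# A deliberately naive loop implementation of a cumulative sum
slow_cumsum <- function(x) {
  res <- numeric(length(x))
  s <- 0
  for (i in seq_along(x)) {
    s <- s + x[i]
    res[i] <- s
  }
  res
}

set.seed(1)
x <- runif(1e5)

# 1. Test: all.equal() compares up to a small numerical tolerance.
#    It returns TRUE or a character description, hence isTRUE().
r1 <- slow_cumsum(x)
r2 <- cumsum(x)
stopifnot(isTRUE(all.equal(r1, r2)))

identical(0.1 + 0.2, 0.3)          # FALSE: exact comparison of doubles
isTRUE(all.equal(0.1 + 0.2, 0.3))  # TRUE: equal up to tolerance

# 2. Time: system.time() returns user, system, and elapsed seconds
t_loop <- system.time(slow_cumsum(x))
t_vec  <- system.time(cumsum(x))
print(rbind(loop = t_loop, vectorized = t_vec)[, 1:3])

# proc.time() gives the same kind of timing by taking differences
t0 <- proc.time()
invisible(slow_cumsum(x))
print(proc.time() - t0)
```

The actual timings depend on the machine and the R version, so no numbers are shown here.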
- Week 6:
- - Continuing "Performance" (see week 5): "Measure, don't guess" --> Using Rprof() and microbenchmark
- The 'matrix vs. data.frame' R-help example continued (see above).
- R's byte compiler (-> require("compiler"); ?cmpfun), see in the above *.Rmd
- Performance <-> Copying of R objects: "Tracing memory" memory-copying.R script.
- Start looking at R packages, source and "binary"; at first packages and namespace: env-namespace.R
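The byte compiler and memory tracing from Week 6 can be tried as follows. This is a sketch of our own, not the course's *.Rmd or memory-copying.R; the function name `f` is made up. Note that from R 3.4.0 on, the JIT compiler is enabled by default, so an explicit cmpfun() often makes little difference there.

```r
library(compiler)

# A loop-heavy function, the kind of code that benefits from byte compilation
f <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + i %% 7
  s
}

fc <- cmpfun(f)   # byte-compiled version of f

# Test first: both versions must give the same result
stopifnot(identical(f(1000), fc(1000)))

# Measure, don't guess
print(system.time(f(1e6)))
print(system.time(fc(1e6)))

# Tracing copies: only available if R was built with memory profiling
if (capabilities("profmem")) {
  a <- c(1, 2, 3)
  tracemem(a)
  b <- a        # no copy yet: a and b share the same memory
  b[1] <- 10    # the copy happens here, and tracemem reports it
  untracemem(a)
}
```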
- Week 7:
- - R packages, in source and "binary"; browsing Notes of "Package writing" course (Rnw and R files) (and the "one" slide from week 2)
- package.skeleton()
- Packages and their Namespaces: Why are namespaces needed: Rmd whyNamespaces.Rmd and its html whyNamespaces.html.
- What happens when you call library(<pkg>) ? -- week7-pkg-namesp.Rmd
- Understand more of How R Searches and Finds Stuff: Script week7-namespace-pkg.R
- Extras:
- - a small script to get all methods of a generic function, nicely in a list, hidden or not.
- from lapply() to parLapply(): R's built-in package 'parallel', i.e. require(parallel)
- What happens when you call library(<pkg>): week7-pkg-namesp.Rmd
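The step from lapply() to parLapply() mentioned under Extras might look like the following sketch. It assumes that two local worker processes can be started on your machine; `slow_square` is a hypothetical function used only for illustration.

```r
library(parallel)

# A function that is artificially slow, so that parallelism can pay off
slow_square <- function(i) {
  Sys.sleep(0.01)
  i^2
}

# Sequential version
res_seq <- lapply(1:8, slow_square)

# Parallel version: same call shape, plus a cluster as first argument
cl <- makeCluster(2)   # two local worker processes (assumption: this is allowed)
res_par <- tryCatch(
  parLapply(cl, 1:8, slow_square),
  finally = stopCluster(cl)   # always shut the workers down
)

# Test: the parallel result must match the sequential one
stopifnot(identical(res_seq, res_par))
```

For short tasks, the cost of starting workers and sending data can outweigh the gain, so timing both versions with system.time() is worthwhile.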
Lecture attestation (Testat):
In order to obtain the ECTS credit, you have to pass the exam (answering some questions and writing R code in a *.Rmd, i.e. R Markdown, file) at the end of the teaching block, specifically on April 15.
Recommended Reading
- Norman Matloff (2011) The Art of R Programming - A tour of statistical software design.
No Starch Press, San Francisco. In stock at Polybuchhandlung (CHF 42.-); see online for data and R code.
- Hadley Wickham (2013 ff) Advanced R, online
more advanced than our course; partly focused on his own packages
- Suraj Gupta (March 29, 2012) How R Searches and Finds Stuff, online;
Tough read, but helpful with its nice illustrations. Do consider Duncan Murdoch's note about it with minor caveats.
Miscellaneous on Programming (with R)
- "Literate Programming" by Donald Knuth
- "The Elements of Programming Style" by Kernighan and Plauger: Wikipedia, Quotes