[R-sig-teaching] Online resources for teaching intro stats using R

Fri Oct 12 15:19:47 CEST 2007

Not being a statistician myself, I often shy away from opposing the
textbook. However, I could not agree more with your statement. Thanks very
much for spelling it out so clearly.  I do believe that we can and should
do something about it; we just need to work out the best mechanism.
There was a comparable effort in the GIS community, the
Core Curriculum in GIScience (http://www.ncgia.ucsb.edu/giscc/), which
tried to achieve something similar to what you propose in that
field, but it is outdated now. There are some lessons there about what to
do and what not to do when developing class material.

Magdiel

Douglas Bates wrote:
> As a long-time user and developer of S then R, I am committed to
> having students, even introductory students, use R in my courses.  I
> primarily teach introductory statistics for engineering students.  I
> think that using R to remove the computational burden (except, of
> course, for the need to learn to use R to some extent) is a remarkable
> enhancement to any intro stats course.
> 
> I also see R as a way of removing some of the material in our courses
> that is no longer necessary.  I tell students that the textbook in the
> introductory engineering statistics course that I took in Spring
> semester of 1969 has essentially the same table of contents as many
> current texts for such a course, despite all the changes in computing
> technology.  In 1969 you could easily tell who the engineering
> students were because we all carried slide rules everywhere.  I don't
> see a lot of slide rules around campus today.
> 
> The inertia regarding topics in an intro course results in students
> still being taught
>  - the normal approximation to a binomial distribution
>  - the Poisson approximation to a binomial distribution
>  - the concept of "sampling with replacement" to motivate a binomial
> approximation to a hypergeometric distribution
>  - histograms but not empirical density plots
>  - hypothesis tests with a fixed level and a rejection region for the
> test statistic, rather than evaluation of a p-value
>  - "large-sample" z-tests or intervals versus "small-sample" t-tests
> or intervals
> We really do owe it to ourselves as a profession to ask ourselves why
> we continue to teach such topics.
> 
> If you stop and think for a moment, none of the approximations of
> distributions that we teach in intro courses are needed.  If you are
> modeling drawing a random sample from a population of size 18,000,000
> and a hypergeometric distribution is appropriate then you can and
> should use a hypergeometric.
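
A minimal sketch of the point above, in base R. Only the population size of 18,000,000 comes from the message; the number of "successes", the sample size, and the cutoff of 480 are made-up values for illustration. It evaluates the hypergeometric probability directly and prints the binomial and normal approximations alongside it purely for comparison.

```r
# Hypothetical numbers, chosen only for illustration.
N      <- 18e6   # population size (the figure used in the message)
K      <- 9e6    # ASSUMED number of "successes" in the population
n_draw <- 1000   # ASSUMED sample size

# Exact: P(X <= 480) when sampling without replacement.
# phyper(q, m, n, k): m = successes, n = failures, k = number drawn.
p_exact <- phyper(480, m = K, n = N - K, k = n_draw)

# Binomial approximation (sampling with replacement), for comparison only.
p_binom <- pbinom(480, size = n_draw, prob = K / N)

# Normal approximation with a continuity correction, for comparison only.
p_norm <- pnorm(480.5,
                mean = n_draw * K / N,
                sd   = sqrt(n_draw * (K / N) * (1 - K / N)))

print(c(exact = p_exact, binomial = p_binom, normal = p_norm))
```

The exact call is no harder to write than either approximation, which is the argument being made.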
> 
> Over the last several years I often found myself in the position of
> opposing the textbook in my courses.  I would say that the text
> describes this awkward way of doing things (transform to a standard
> normal, juggle around the "less than" and "greater than" signs until
> you can evaluate a probability from a table in the text) but you, the
> student, should ignore that and do things the much easier way of
> simply evaluating the probability that you want to evaluate.  This is
> a burden on students.  Even though we would like to think that our
> brilliant lectures are the main font of wisdom for students in our
> courses, a substantial portion of them learn most of the material from
> the text. When the text and the instructor disagree, confusion ensues.
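
To illustrate the contrast described above, here is a small sketch in base R. The distribution (mean 100, standard deviation 15) and the limits 90 and 120 are assumed values, not taken from the message. The "table-style" lines mimic standardizing by hand, while the direct lines ask for the desired probability on the original scale.

```r
# ASSUMED example: X ~ Normal(mean = 100, sd = 15); find P(90 < X < 120).
mu    <- 100
sigma <- 15

# Table-style route: standardize first, then look up standard normal values.
z_lo <- (90  - mu) / sigma
z_hi <- (120 - mu) / sigma
p_table_style <- pnorm(z_hi) - pnorm(z_lo)

# Direct route: evaluate the probability on the original scale.
p_direct <- pnorm(120, mean = mu, sd = sigma) - pnorm(90, mean = mu, sd = sigma)

# An upper-tail probability, P(X > 120), without juggling "1 minus" by hand.
p_upper <- pnorm(120, mean = mu, sd = sigma, lower.tail = FALSE)

print(c(table_style = p_table_style, direct = p_direct, upper_tail = p_upper))
```

Both routes give the same number; the direct one simply skips the standardization and the table lookup.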
> 
> I have reached the point where I can tell in a few seconds if I want
> to consider a text.  If I open it up and see probability tables in an
> appendix I reject it.
> 
> One approach is to change to a text that does use R, such as the books
> by Peter Dalgaard or John Verzani or Michael Crawley.  In fact I am
> using Peter's book but it is difficult to use as a stand-alone text if
> one is also expected to cover some probability.
> 
> I am supplementing Peter's book with slides and other PDF documents
> created from Sweave sources.  As seems to happen in courses that I
> teach, these are "just in time" documents (and, on occasion, "just a
> little too late").
> 
> It is always difficult to create a textbook using a system like R
> because the software changes so rapidly relative to the time scale for
> writing and publishing texts.  A five-year-old book like Peter's is a
> recent book.  A five-year-old version of R is ancient.  I'm back at my
> old tricks of disagreeing with the text even on Peter's book because I
> think that Peter does a wonderful job of explaining traditional
> graphics in R but students should forgo that and learn lattice right
> from the start.  In the five years since Peter's book was published
> (which means six or seven years since he wrote various sections of it)
> lattice has matured tremendously and is now documented in Paul
> Murrell's book on R Graphics and in Deepayan's forthcoming book on
> Lattice (which is "insanely great", by the way).
> 
> Many publishers (but, thankfully, not Peter's publisher) would like to
> be able to control all aspects of the course presentation.  It is
> natural to them that if there is to be electronic material, such as
> data sets and sample analyses, associated with the text then they
> should publish it as a CD-ROM to be included with the text.  We all
> know that doesn't work because the CD-ROM is going to be exactly as
> old as the text but the material on the CD-ROM should have changed
> much more rapidly.
> 
> I am becoming convinced that all the supplemental materials should be
> available on the web.  CRAN packages provide one very useful way of
> disseminating the supplemental material but other forms, perhaps a
> wiki, may also be useful.  There is a need for practice material and
> worked-out examples and reference sheets for basic definitions and
> facts about distributions, for example, that go beyond what would
> traditionally be part of a CRAN package.
> 
> Some materials on general topics such as various probability
> distributions or data graphics or classical tests in R could be useful
> without reference to a particular book.  I am thinking of the sort of
> "Schaum's Outline" supplement where basic properties and definitions
> are presented and a number of worked-out examples are given.  Another
> useful resource for intro teaching would be a repository of test
> questions although that may perhaps be too instructor-dependent.
> 
> I don't know the best way to collaborate on building such resources.
> I would consider something like a series of vignettes so the R code
> could be available as well as a printable file and probably the
> sources.  Others may find a wiki to be more natural although I am
> still trying to decide what a wiki is (remember - I was taking intro
> stats in 1969).  One possible collaborative mechanism is the
> R-forge.R-project.org site where we could start a project and
> contribute to it.  It has the advantage of a stable and relatively
> easy to remember URL for referring students.
> 
> I welcome private replies and comments or, preferably, a discussion on
> this list.
> 
> _______________________________________________
> R-sig-teaching at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
>