[R-sig-teaching] t.test from summary of data

Tue Nov 9 15:17:51 CET 2010

On Tue, Nov 9, 2010 at 6:43 AM, Robert W. Hayden <hayden at mv.mv.com> wrote:
> Forwarded message:
>> From: "Byungchul Cha" <cha at muhlenberg.edu>
>
>> We use "Intro to the practice of statistics" by Moore, McCabe, Craig
>> as our textbook. Is there an option for the command "t.test" so that
>> it can perform a t-test from a summary of the sample data (that is,
>> sample mean, sample sd, and sample size) instead of the raw sample
>> data? I've been looking for a way to do so, but I couldn't find one.
>> I believe it is possible to write code to do this for my own use,
>> but I am just curious: is there a pedagogical reason not to do so?
>
> YES!
>
> One of the reasons to use statistical software in a first course is so
> that students can work with real, raw data.  For example, it is nearly
> impossible to check assumptions if you do not have the data.
> However, publishers want to sell books to EVERYONE, including those
> who do not have or do not use any technology beyond a $2 calculator.
> So the homework problems are driven by this least common denominator.
>
> These data-free homework problems provide practice in plugging numbers
> into formulae, an Algebra I or even junior high school skill, and a
> pretty minor skill for doing statistics in the real world.  So I would
> suggest not assigning these and replacing them with problems that
> involve real data, and asking students to LOOK AT THE DATA to check
> assumptions and look for gross errors. I think you can find many real
> data sets on the CD that comes with this text, and many are included
> with R as well. Other sources include DASL and the _Journal of
> Statistics Education_.

I completely agree that the practice of teaching statistics as a
collection of plug-in formulas, without any indication that you might,
say, plot the data to see if it is close to the form assumed in the
formula, is quite outmoded.  I took a first course in statistics over
forty years ago when our calculation capabilities were limited to
pencil and paper or slide rules.  Naturally we needed to use any
computational short cut we could find and we wouldn't be able to
calculate quantities related to distributions without access to
tables.  This led to many approximations (normal or Poisson
approximation to the binomial, binomial instead of hypergeometric,
normal approximation to Poisson, normal approximation to t when
degrees of freedom exceeded 30, etc.).
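
To make the point about approximations concrete, here is a minimal
sketch in R comparing two of those table-era shortcuts with the exact
values that software returns directly.  The sample size, success
probability, and cutoff are made-up illustrative numbers, not taken
from any textbook problem.

```r
# Critical value for a two-sided 95% interval: exact t with 30 degrees
# of freedom versus the normal approximation once used for large df.
t_crit <- qt(0.975, df = 30)
z_crit <- qnorm(0.975)

# Binomial probability P(X <= k): exact versus the normal
# approximation with a continuity correction.
# n, p, and k are illustrative assumptions.
n <- 40
p <- 0.3
k <- 15
exact_binom  <- pbinom(k, size = n, prob = p)
approx_binom <- pnorm(k + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))

print(c(t = t_crit, z = z_crit))
print(c(exact = exact_binom, approx = approx_binom))
```

The exact calls are no harder to type than the approximate ones, so
the shortcut saves nothing once software is available.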

Things have advanced a lot since then.  Lectures can be presented
using slides on a laptop hooked to a projector and the students often
follow along with copies of the slides on their own laptops or perhaps
on tablet computers or even smart phones.  The fact that we teach the
same approach of plugging numbers into simplistic formulas and using
approximations so that students can get the answers by looking up
numbers in tables is absurd, yet we do so.  One reason is inertia: if
my introductory course in statistics was taught this way, then I should
teach all subsequent introductory courses this way.  Inertia is
encouraged by the textbook industry, which concentrates on providing
new editions of existing texts, not because the content is new or
provides greater insight into data but because the problems have
different numbers and students must buy new books instead of used
books.

Also, statisticians should recognize that the majority of introductory
statistics courses are taught in mathematics departments at two-year
or four-year colleges, which means that the instructor is often a
person whose training is in some area of mathematics not related to
statistics and who was unlucky enough to draw the short straw.  Not
surprisingly, such an instructor feels more comfortable teaching formula
derivations than teaching exploration of data, assessment of
assumptions, and how to use software to do so.

Saying we can't teach data exploration, etc. because students have
limited computational capabilities is an argument that gets weaker and
weaker every year.  The cost of inexpensive computers like netbooks is
approaching the cost of one or two textbooks, not to mention other
computing devices like smart phones.  I think it is just too easy to
keep doing what we have been doing.  It's difficult to write about or
to teach how to use software because the software keeps changing.  It
is much easier to present the same old formulas with a couple of new,
usually artificial, examples thrown in to make things look new.
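
On the original question: as far as I know, t.test() in base R accepts
only raw data.  If summary-only problems must be handled, a small
helper is easy to write.  The sketch below is a hypothetical function
(the name t_test_summary and its arguments are my own, not part of R)
for a one-sample, two-sided test, and it checks itself against
t.test() on a small made-up data vector.

```r
# Hypothetical helper: one-sample, two-sided t-test from summary values.
# xbar = sample mean, s = sample sd, n = sample size, mu0 = null mean.
t_test_summary <- function(xbar, s, n, mu0 = 0, conf.level = 0.95) {
  se    <- s / sqrt(n)                    # standard error of the mean
  df    <- n - 1                          # degrees of freedom
  tstat <- (xbar - mu0) / se              # t statistic
  pval  <- 2 * pt(-abs(tstat), df)        # two-sided p-value
  half  <- qt(1 - (1 - conf.level) / 2, df) * se
  list(statistic = tstat, df = df, p.value = pval,
       conf.int = c(xbar - half, xbar + half))
}

# Made-up data, used only to compare the helper with t.test().
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
from_raw     <- t.test(x, mu = 5)
from_summary <- t_test_summary(mean(x), sd(x), length(x), mu0 = 5)

print(from_raw$p.value)
print(from_summary$p.value)
```

The two p-values agree, but only the raw-data version lets a student
plot x first, which is the point made above.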

> ------->  First-time AP Stats. teacher?  Help is on the way! See
>
> http://courses.ncssm.edu/math/Stat_Inst/Stats2007/Bob%20Hayden/Relief.html
>
> Robert W. Hayden, P.O Box 450, North Troy, VT 05859
> phone (802) 988-2587  web site http://statland.org/
> email  bob statland.org  (add your own "@" and save me some spam)