[R] [OT] Example Data for Non-statisticians

Frank E Harrell Jr f.harrell at vanderbilt.edu
Thu Sep 2 14:01:27 CEST 2004


Kevin Wang wrote:
> (Sorry for the slightly off topic post)
> 
> I'm giving a talk (on data mining) to some non-statisticians (who're
> all postgrad students, but a mixture of Science and Commerce majors).
> 
> My intention is to show them the importance of statistics when doing
> data mining.  What I'm thinking of doing is using, hopefully, two
> datasets.  One from scientific area and another that is
> commercially-related.  However, it would be nice if the datasets (or
> at least one of them) violated some basic statistical assumptions
> (in raw form, anyway), showing that basic statistical knowledge is
> important.  Also, hopefully, I can introduce R
> to them (since many of them haven't heard of it yet).
> 
> Does anyone have (or know where I can get) such data?  It doesn't have
> to be huge,.....
> 
> Thanks!
> 
> Kevin
> 
The titanic3 dataset on our web site may fit the bill, although the
response variable is binary.  To get it, issue
loadUrl('http://biostat.mc.vanderbilt.edu/twiki/pub/Main/DataSets/titanic3.sav')
to load( ) it.  The assumptions that a trivial analysis would violate
are additivity of age and passenger class, and perhaps linearity of
age.  At least it is a dataset that everyone already understands.
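
As a rough illustration of what checking those two assumptions could look
like, here is a minimal sketch in base R.  It uses simulated data as a
stand-in, not the titanic3 values; the variable names (survived, age,
pclass), the simulated effect sizes, and the choice of a natural spline
with 4 degrees of freedom are all assumptions made for the example.  It
compares a trivial additive, linear logistic model with one that adds an
age-by-class interaction and one that lets age act nonlinearly.

```r
library(splines)  # ships with base R; provides ns() for natural splines

# Simulated stand-in data (assumption: NOT the real titanic3 values)
set.seed(1)
n      <- 1000
age    <- runif(n, 1, 70)
pclass <- factor(sample(c("1st", "2nd", "3rd"), n, replace = TRUE))

# True log-odds: the age slope differs by class (non-additive) and
# there is a quadratic age term (nonlinear)
k         <- as.integer(pclass)
intercept <- c(1.5, 0.5, -0.2)[k]
slope     <- c(-0.01, -0.03, -0.05)[k]
lp        <- intercept + slope * age + 0.001 * (age - 35)^2
survived  <- rbinom(n, 1, plogis(lp))
d         <- data.frame(survived, age, pclass)

# The trivial analysis: age linear, age and class additive
fit_add <- glm(survived ~ age + pclass, family = binomial, data = d)

# Relax additivity: allow the age effect to vary by class
fit_int <- glm(survived ~ age * pclass, family = binomial, data = d)

# Relax linearity: natural cubic spline in age
fit_nl  <- glm(survived ~ ns(age, df = 4) + pclass,
               family = binomial, data = d)

# Likelihood ratio tests of each relaxed model against the trivial one
print(anova(fit_add, fit_int, test = "Chisq"))
print(anova(fit_add, fit_nl,  test = "Chisq"))
```

A small p-value in either comparison suggests the corresponding assumption
of the trivial model is doubtful for the data at hand.  With the real
dataset, the same three model formulas could be fitted once the data are
loaded, provided the column names match.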

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
