This collection contains datasets used in the book Pattern Recognition and Neural Networks by B.D. Ripley (1996), Cambridge University Press, ISBN 0 521 46986 7 (hardback, 416 pages, 29.95 pounds, US$49.95).
The background to the datasets is described in section 1.4; this file relates the computer-readable files to that description.
Data from Aitchison & Dunsmore (1975, Tables 11.1-3).
Data file Cushings.dat has four columns,

Data from Ripley (1994a).
This has two real-valued co-ordinates (xs and ys) and a class (xc)
which is 0 or 1.
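
As a minimal sketch of reading data in this layout, the Python below parses whitespace-separated rows with the three fields named above. The header line, the column order, and the sample values are assumptions made for illustration; the sample rows are placeholders and are not taken from the real dataset.

```python
import io

# Placeholder rows for illustration only (NOT real data). The header line
# and whitespace-separated layout are assumptions about the file format.
SAMPLE = """xs ys xc
0.10 0.70 0
-0.30 0.20 1
"""

def read_synth(handle):
    """Read (xs, ys) points and 0/1 class labels from an open text handle."""
    header = handle.readline().split()
    if header != ["xs", "ys", "xc"]:
        raise ValueError(f"unexpected header: {header}")
    points, labels = [], []
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        xs, ys, xc = line.split()
        label = int(xc)
        if label not in (0, 1):
            raise ValueError(f"class must be 0 or 1, got {label}")
        points.append((float(xs), float(ys)))
        labels.append(label)
    return points, labels

points, labels = read_synth(io.StringIO(SAMPLE))
```

The same function can be given a real file handle in place of the in-memory sample, provided the file actually follows the assumed layout.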
This is a dataset on 61 viruses with rod-shaped particles affecting
various crops (tobacco, tomato, cucumber and others) described by
Fauquet et al. (1988) and analysed by Eslava-Gómez (1989). There
are 18 measurements on each virus, the number of amino acid residues
per molecule of coat protein.

Data from Campbell & Mahon (1974) on the morphology of rock crabs of
genus Leptograpsus.
There are 50 specimens of each sex of each of two colour forms.

This example comes from forensic testing of glass collected by
B. German on 214 fragments of glass. It is also contained in the
UCI machine-learning database collection (Murphy & Aha, 1995).

A population of women who were at least 21 years old, of Pima Indian heritage
and living near Phoenix, Arizona, was tested for diabetes
according to World Health Organization criteria. The data
were collected by the US National Institute of Diabetes and Digestive and
Kidney Diseases (Smith et al., 1988). This example is also contained in the
UCI machine-learning database collection (Murphy & Aha, 1995).
The whole dataset is in order Hordeviruses (3), Tobraviruses (6),
Tobamoviruses (39) and `furoviruses' (13).
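
Because the rows are grouped in the order just given, group labels can be rebuilt from the counts alone. The sketch below does this in Python; the names `GROUP_SIZES`, `group_labels` and `group_row_ranges` are hypothetical, and the 1-based row numbering is a convention chosen here, not something the data file specifies.

```python
# Group sizes in file order, as stated in the description above.
GROUP_SIZES = [
    ("Hordeviruses", 3),
    ("Tobraviruses", 6),
    ("Tobamoviruses", 39),
    ("furoviruses", 13),
]

def group_labels(group_sizes):
    """Expand (name, count) pairs into one label per row, in file order."""
    return [name for name, count in group_sizes for _ in range(count)]

def group_row_ranges(group_sizes):
    """Return {name: (first_row, last_row)}, 1-based and inclusive."""
    ranges = {}
    start = 1
    for name, count in group_sizes:
        ranges[name] = (start, start + count - 1)
        start += count
    return ranges

labels = group_labels(GROUP_SIZES)
ranges = group_row_ranges(GROUP_SIZES)

# The four counts should account for all 61 viruses.
assert len(labels) == 61
```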
Leptograpsus crabs
Forensic glass
The type codes are:
The ten groups used for the cross-validation experiments (I believe)
are listed as row numbers in the file fglass.grp
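
One way such groups could drive a cross-validation loop is sketched below in Python. The layout of fglass.grp is not described here, so the sketch assumes the groups have already been read into lists of 1-based row numbers; the function name `cv_splits` and the toy groups are hypothetical and are not the real contents of the file.

```python
def cv_splits(groups, n_rows):
    """Yield (train_rows, test_rows) pairs, holding out one group at a time.

    `groups` is a list of lists of 1-based row numbers (an assumed
    representation). The groups must partition the rows 1..n_rows.
    """
    all_rows = set(range(1, n_rows + 1))
    seen = set()
    for group in groups:
        members = set(group)
        if len(members) != len(group) or members & seen:
            raise ValueError("a row number appears more than once")
        seen |= members
    if seen != all_rows:
        raise ValueError("groups do not cover rows 1..n_rows exactly")
    for held_out in groups:
        test_rows = sorted(held_out)
        train_rows = sorted(all_rows - set(held_out))
        yield train_rows, test_rows

# Toy example with 6 rows in 3 groups (illustrative values only).
toy_groups = [[1, 4], [2, 5], [3, 6]]
splits = list(cv_splits(toy_groups, n_rows=6))
```

With the real data the call would use ten groups and `n_rows=214`, once the group file has been parsed according to its actual format.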
Diabetes in Pima Indians
Last edited on
Tues Nov 7 1995
by Brian Ripley
(ripley@stats.ox.ac.uk)