[R] logit and polytomous data
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri Mar 10 13:15:42 CET 2000
> Date: Fri, 10 Mar 2000 12:18:14 +0200 (EET)
> From: Kari Ruohonen <toimelat at saunalahti.fi>
>
> I am new to generalized linear models and studying
> McCullagh & Nelder (1989). Especially, I have a problem
> resembling the \"cheese taste\" example (5.3.1. p. 109) of
> the book.
You confused me. It is on page 109 of McCullagh & Nelder (1983), the
first edition. That example is not using a glm: see later.
> I tried to analyse the cheese example with R but
> failed to do so because R allowed me to use logit link
> function only with binary family that supposes 0 <= y <= 1.
> Do I need to scale the y\'s or is there another way?
You want the *binomial* family. From Venables & Ripley
(1999, pp. 218-9)
1. If the response is a numeric vector it is assumed to hold
the data in ratio form, $y_i = s_i/a_i$, in which case the
$a_i$s must be given as a vector of weights using the
\sfn{weights} argument. (If the $a_i$ are all one the default
\sfn{weights} suffices.)
2. If the response is a logical vector or a two-level factor
it is treated as a 0/1 numeric vector and handled as previously.
If the response is a multi-level factor, the first level is treated
as 0 (failure) and all others as 1 (success).
3. If the response is a two-column matrix it is assumed that the
first column holds the number of successes, $s_i$, and the second
holds the number of failures, $a_i - s_i$, for each trial. In
this case no \sfn{weights} argument is required.
The less-intuitive third form allows the fitting function to select a
better starting value, so we tend to favour it.
This is using the standard notation for a GLM family: for a binomial
s_i is the number of successes out of a_i trial.
HOWEVER, the analysis in McCullagh and Nelder (1983) is via an ordinal
logistic regression.
Here's an R analysis:
library(MASS)
counts <- scan()
0 0 1 7 8 8 19 8 1
6 9 12 11 7 6 1 0 0
1 1 6 8 23 7 5 1 0
0 0 0 1 3 7 14 16 11
resp <- ordered(gl(9, 1, 36))
cheese <- gl(4, 9, 36, labels=LETTERS[1:4])
cheese <- relevel(cheese, "D")
fit <- polr(resp ~ cheese, weights=counts, Hess=TRUE)
fit
Call:
polr(formula = resp ~ cheese, weights = counts, Hess=TRUE)
Coefficients:
cheeseA cheeseB cheeseC
-1.612792 -4.964633 -3.322695
Intercepts:
1|2 2|3 3|4 4|5 5|6 6|7 7|8
-7.0801140 -6.0249573 -4.9253989 -3.8568048 -2.5205479 -1.5685582 -0.0669085
8|9
1.4929198
Residual Deviance: 711.35
AIC: 733.35
summary(fit)
...
Coefficients:
Value Std. Error t value
cheeseA -1.612792 0.3805424 -4.238139
cheeseB -4.964633 0.4767185 -10.414182
cheeseC -3.322695 0.4218289 -7.876879
...
vcov(fit)
...
(Summary on polr should be able to give the correlations, but the presnet
version has a small bug.)
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UK Fax: +44 1865 272595
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
More information about the R-help
mailing list