# [R] FW: Levels and GLM

Kuhn, Max Max.Kuhn at pfizer.com
Fri Jul 7 22:17:06 CEST 2006

```
One correction... since you are fitting a logistic model, it is
technically correct to say the "mean value of the linear predictor"
rather than the "mean response."

20 lashes for me.

Max

-----Original Message-----
From: Kuhn, Max
Sent: Friday, July 07, 2006 4:11 PM
To: 'r-help at stat.math.ethz.ch'
Subject: [R] Levels and GLM

jdrapp,

By default, R fits full rank models. If you are coming from SAS, you're
probably used to less than full rank model parameterizations.

From Section 11.1.1 of "An Introduction to R" at

http://cran.r-project.org/doc/manuals/R-intro.html#Contrasts

there is this:

"What about a k-level factor A? The answer differs for unordered and
ordered factors. For unordered factors k - 1 columns are generated
for the indicators of the second, ..., kth levels of the factor.
(Thus the implicit parameterization is to contrast the response at
each level with that at the first.)"

So level "M" is the "reference cell". Assuming that
data.logistic$Overall is continuous, the intercept is the estimate of
the mean response when maj = "M" and data.logistic$Overall = 0. The
estimate for majN is the difference between the reference cell
(estimated by the intercept) and the mean response when maj = "N" and
data.logistic$Overall = 0.

You should check out ?model.matrix and ?contrasts.

Max

> I am using the as.factor command to use with glm.  When I use the
> command
>
> >maj <- as.factor(data.logistic$Majors)
> >maj
>
> I receive the following output:
>   [1] M M N M M M M N N M M M N M M M M M M M M M M M N M N N M M N M
> M N M M M M M
>  [40] N M N M M N M M M N M N M N M N N N M N M M M M M M N M N M M M
> M M N N M M M
>  [79] M M M N N M M N M N M M M M M M M M M M M M M M M N M M M M M N
> M M M M M N M
> [118] M M M N M N N M M M M M M M M N M N M M M M M N M M M M N M M M
> N N M M M N M
> [157] M M M M M M M M M M M M M N M M N N M M N M M M M M M M M M M M
> M M N M N M M
> [196] M N M M M M M M M M N M M M M M M M M N M M M M M M M M M M M M
> M M N M M N N
> [235] M M M M M N M M M M M M N N M M N M M M M M M M M M M M M M M M
> M N M M M M N
> [274] N M M M M M M N M M M M M M M M M M N N M N M M M M M M M M M M
> N M N N M M M
> [313] M M M M M M M N M M M M M N M M M M M M M M M M M M M M M N M M
> M M M M M N M
> [352] M N M N M M N M M M M N M M M M M M M M M M N M M N N
> Levels: M N
>
> When I enter:
>
> > logistic.glm <- glm(data.logistic$X100.Yard.Average ~
> data.logistic$Overall + maj, family=binomial)
> > logistic.glm
>
> I receive the following output:
>
> Call:  glm(formula = data.logistic$X100.Yard.Average ~
> data.logistic$Overall +      maj, family = binomial)
>
> Coefficients:
>           (Intercept)  data.logistic$Overall                   majN
>               2.38819               -0.02718               -0.18385
>
> Degrees of Freedom: 377 Total (i.e. Null);  375 Residual
> Null Deviance:	    514.5
> Residual Deviance: 410.7 	AIC: 416.7
>
> My question:  Why is there no output for majM?  Any help would be
> greatly appreciated

```
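The reference-cell coding described in the reply can be checked on a small example. The sketch below uses simulated data, since the original `data.logistic` is not available; the variable names `overall`, `y`, `maj2` and all simulated values are assumptions made for illustration. It shows the default treatment contrasts, the columns `model.matrix` builds, and what happens when the reference level is switched with `relevel`.

```r
# Simulated stand-in for the poster's data. All names and values here are
# hypothetical; this is not the original data.logistic.
set.seed(1)
n       <- 200
maj     <- factor(sample(c("M", "N"), n, replace = TRUE))
overall <- round(runif(n, 40, 100))
eta     <- 2.4 - 0.03 * overall - 0.2 * (maj == "N")
y       <- rbinom(n, 1, plogis(eta))

# Default coding for an unordered factor is treatment contrasts:
# only the non-reference level ("N") gets an indicator column.
contrasts(maj)
head(model.matrix(~ overall + maj))

fit <- glm(y ~ overall + maj, family = binomial)
coef(fit)  # (Intercept), overall, majN; no majM, because "M" is the reference

# Coefficients are on the linear predictor (logit) scale.
# plogis() maps them back to a probability.
plogis(coef(fit)[["(Intercept)"]])                        # maj = "M", overall = 0
plogis(coef(fit)[["(Intercept)"]] + coef(fit)[["majN"]])  # maj = "N", overall = 0

# Make "N" the reference level instead. The factor coefficient is now
# reported for "M" and is the negative of majN from the first fit.
maj2 <- relevel(maj, ref = "N")
fit2 <- glm(y ~ overall + maj2, family = binomial)
coef(fit2)
```

Both fits describe the same model; only the parameterization changes, so the fitted probabilities are identical and the choice of reference level is a matter of which comparison is easiest to read.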