family {stats} | R Documentation |

Family objects provide a convenient way to specify the details of the
models used by functions such as `glm`. See the documentation for
`glm` for the details on how such model fitting takes place.

family(object, ...)

binomial(link = "logit")
gaussian(link = "identity")
Gamma(link = "inverse")
inverse.gaussian(link = "1/mu^2")
poisson(link = "log")
quasi(link = "identity", variance = "constant")
quasibinomial(link = "logit")
quasipoisson(link = "log")

`link` |
a specification for the model link function. This can be
a name/expression, a literal character string, a length-one character
vector, or an object of class `"link-glm"`. |

`variance` |
for all families other than |

`object` |
the function |

`...` |
further arguments passed to methods. |

`family` is a generic function with methods for classes `"glm"` and
`"lm"` (the latter returning `gaussian()`).
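As a small illustration of the two methods, the sketch below fits a linear model and a Poisson GLM and queries each with `family()`. The built-in `cars` data and the object names are chosen here purely for illustration; they are not part of this help page.

```r
## Illustrative only: 'cars' is a built-in dataset, used as a stand-in
fit.lm  <- lm(dist ~ speed, data = cars)
fit.glm <- glm(dist ~ speed, data = cars, family = poisson())

## The "lm" method returns gaussian(); the "glm" method returns the
## family object that was used in the fit.
fam.lm  <- family(fit.lm)
fam.glm <- family(fit.glm)

fam.lm
fam.glm
```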

For the `binomial` and `quasibinomial` families the response
can be specified in one of three ways:

As a factor: ‘success’ is interpreted as the factor not having the first level (and hence usually of having the second level).

As a numerical vector with values between `0` and `1`, interpreted as the proportion of successful cases (with the total number of cases given by the `weights`).

As a two-column integer matrix: the first column gives the number of successes and the second the number of failures.
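The three response forms can be compared on one data set. The sketch below uses made-up dose-response counts (the variables `dose`, `n`, `succ` and the factor levels are hypothetical) and fits the same logistic model each way; since the forms describe the same data, the coefficient estimates should agree up to numerical tolerance.

```r
## Hypothetical grouped data: 'succ' successes out of 'n' trials per dose
dose <- c(1, 2, 3, 4, 5)
n    <- c(20, 20, 20, 20, 20)
succ <- c(2, 6, 11, 15, 19)

## Two-column matrix: successes in column 1, failures in column 2
fit.mat <- glm(cbind(succ, n - succ) ~ dose, family = binomial())

## Proportions, with the number of cases supplied through 'weights'
fit.prop <- glm(succ / n ~ dose, family = binomial(), weights = n)

## Factor response, one row per case; the first level ("fail") is failure
long <- data.frame(
  dose    = rep(rep(dose, 2), times = c(succ, n - succ)),
  outcome = factor(rep(c("pass", "fail"),
                       times = c(sum(succ), sum(n - succ))),
                   levels = c("fail", "pass"))
)
fit.fac <- glm(outcome ~ dose, family = binomial(), data = long)

## Side-by-side coefficient estimates from the three specifications
cbind(matrix     = coef(fit.mat),
      proportion = coef(fit.prop),
      factor     = coef(fit.fac))
```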

The `quasibinomial` and `quasipoisson` families differ from
the `binomial` and `poisson` families only in that the
dispersion parameter is not fixed at one, so they can model
over-dispersion. For the binomial case see McCullagh and Nelder
(1989, pp. 124–8). Although they show that there is (under some
restrictions) a model with
variance proportional to mean as in the quasi-binomial model, note
that `glm` does not compute maximum-likelihood estimates in that
model. The behaviour of S is closer to the quasi- variants.
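A minimal sketch of what "dispersion not fixed at one" means in practice, using the `d.AD` data from `example(glm)`: the point estimates from `poisson` and `quasipoisson` fits coincide, while the quasi- standard errors are the Poisson ones scaled by the square root of the estimated dispersion. The object names are illustrative.

```r
## Data from example(glm)
d.AD <- data.frame(treatment = gl(3, 3),
                   outcome   = gl(3, 1, 9),
                   counts    = c(18, 17, 15, 20, 10, 20, 25, 13, 12))

fit.p  <- glm(counts ~ outcome + treatment, data = d.AD, family = poisson())
fit.qp <- glm(counts ~ outcome + treatment, data = d.AD, family = quasipoisson())

## Same coefficient estimates under both families
all.equal(coef(fit.p), coef(fit.qp))

## Dispersion: fixed at 1 for poisson, estimated for quasipoisson
phi.p  <- summary(fit.p)$dispersion
phi.qp <- summary(fit.qp)$dispersion

## Standard errors differ by a factor of sqrt(estimated dispersion)
se.p  <- summary(fit.p)$coefficients[, "Std. Error"]
se.qp <- summary(fit.qp)$coefficients[, "Std. Error"]
all.equal(se.qp, se.p * sqrt(phi.qp))
```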

An object of class `"family"` (which has a concise print method).
This is a list with elements

`family` |
character: the family name. |

`link` |
character: the link name. |

`linkfun` |
function: the link. |

`linkinv` |
function: the inverse of the link function. |

`variance` |
function: the variance as a function of the mean. |

`dev.resids` |
function giving the deviance for each observation
as a function of |

`aic` |
function giving the AIC value if appropriate (but |

`mu.eta` |
function: derivative of the inverse-link function
with respect to the linear predictor. If the inverse-link
function is |

`initialize` |
expression. This needs to set up whatever data
objects are needed for the family as well as |

`validmu` |
logical function. Returns |

`valideta` |
logical function. Returns |

`simulate` |
(optional) function |
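The elements listed above can be inspected directly. The following sketch uses `poisson()` as an example family (the input values are arbitrary) and checks that `linkinv` undoes `linkfun` and that `mu.eta` matches a finite-difference derivative of `linkinv`.

```r
fam <- poisson()          # log link, variance equal to the mean
names(fam)                # the components stored in the family object

mu  <- c(0.5, 1, 2, 10)   # arbitrary positive means
eta <- fam$linkfun(mu)    # link function: log(mu) for this family
back <- fam$linkinv(eta)  # inverse link maps eta back to mu
v   <- fam$variance(mu)   # variance function: mu itself for this family

## mu.eta is the derivative of linkinv with respect to eta;
## compare it with a central finite difference
h <- 1e-6
num.deriv <- (fam$linkinv(eta + h) - fam$linkinv(eta - h)) / (2 * h)
all.equal(fam$mu.eta(eta), num.deriv, tolerance = 1e-6)

## validmu reports whether a vector of means is admissible
ok  <- fam$validmu(mu)         # positive, finite means
bad <- fam$validmu(c(-1, 2))   # a negative mean is not valid for poisson
```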

The `link` and `variance` arguments have rather awkward
semantics for back-compatibility. The recommended way is to supply
them as quoted character strings, but they can also be supplied
unquoted (as names or expressions). Additionally, they can be
supplied as a length-one character vector giving the name of one of
the options, or as a list (for `link`, of class `"link-glm"`).
The restrictions apply only to links given as
names: when given as a character string all the links known to
`make.link` are accepted.
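The alternative ways of supplying `link` can be sketched as follows, using `binomial` with the probit link. The variable names are illustrative. The last two calls illustrate the point about restrictions: `"identity"` is accepted as a string, whereas the unquoted name `identity` is not an allowed link name and evaluates to the base function `identity`, so an error is expected.

```r
b1 <- binomial(link = "probit")             # quoted string (recommended)
b2 <- binomial(link = probit)               # unquoted name
lnk <- "probit"
b3 <- binomial(link = lnk)                  # length-one character vector
b4 <- binomial(link = make.link("probit"))  # object of class "link-glm"

## Link name recorded in each resulting family object
links <- sapply(list(b1, b2, b3, b4), function(f) f$link)
links

## A link known to make.link, given as a character string
b5 <- binomial(link = "identity")

## The same link given as an unquoted name; try() captures the error
res <- try(binomial(link = identity), silent = TRUE)
inherits(res, "try-error")
```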

This is potentially ambiguous: supplying `link = logit` could mean
the unquoted name of a link or the value of object `logit`. It
is interpreted if possible as the name of an allowed link, then
as an object. (You can force the interpretation to always be the value of
an object via `logit[1]`.)
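The ambiguity can be seen in a short sketch. Here a variable deliberately named `logit` (a contrived choice, for illustration only) holds the string `"probit"`; passing it unquoted selects the logit link by name, while `logit[1]` forces the value to be used.

```r
logit <- "probit"                # a variable that shares a link's name

f1 <- binomial(link = logit)     # interpreted as the name of the logit link
f2 <- binomial(link = logit[1])  # evaluated as an object: value "probit"

c(by.name = f1$link, by.value = f2$link)
```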

The design was inspired by S functions of the same names described
in Hastie & Pregibon (1992) (except `quasibinomial` and
`quasipoisson`).

McCullagh, P. and Nelder, J. A. (1989)
*Generalized Linear Models.*
London: Chapman and Hall.

Dobson, A. J. (1983)
*An Introduction to Statistical Modelling.*
London: Chapman and Hall.

Cox, D. R. and Snell, E. J. (1981)
*Applied Statistics; Principles and Examples.*
London: Chapman and Hall.

Hastie, T. J. and Pregibon, D. (1992)
*Generalized linear models.*
Chapter 6 of *Statistical Models in S*
eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

For binomial *coefficients*, `choose`;
the binomial and negative binomial *distributions*,
`Binomial`, and `NegBinomial`.

require(utils) # for str

nf <- gaussian()  # Normal family
nf
str(nf)

gf <- Gamma()
gf
str(gf)
gf$linkinv
gf$variance(-3:4) #- == (.)^2

## Binomial with default 'logit' link:  Check some properties visually:
bi <- binomial()
et <- seq(-10,10, by=1/8)
plot(et, bi$mu.eta(et), type="l")
## show that mu.eta() is derivative of linkinv() :
lines((et[-1]+et[-length(et)])/2, col=adjustcolor("red", 1/4),
      diff(bi$linkinv(et))/diff(et), type="l", lwd=4)
## which here is the logistic density:
lines(et, dlogis(et), lwd=3, col=adjustcolor("blue", 1/4))
stopifnot(exprs = {
  all.equal(bi$ mu.eta(et), dlogis(et))
  all.equal(bi$linkinv(et), plogis(et) -> m)
  all.equal(bi$linkfun(m ), qlogis(m))    #  logit(.) == qlogis(.) !
})

## Data from example(glm) :
d.AD <- data.frame(treatment = gl(3,3),
                   outcome   = gl(3,1,9),
                   counts    = c(18,17,15, 20,10,20, 25,13,12))
glm.D93 <- glm(counts ~ outcome + treatment, d.AD, family = poisson())
## Quasipoisson: compare with above / example(glm) :
glm.qD93 <- glm(counts ~ outcome + treatment, d.AD, family = quasipoisson())

glm.qD93
anova  (glm.qD93, test = "F")
summary(glm.qD93)
## for Poisson results (same as from 'glm.D93' !)  use
anova  (glm.qD93, dispersion = 1, test = "Chisq")
summary(glm.qD93, dispersion = 1)

## Example of user-specified link, a logit model for p^days
## See Shaffer, T.  2004. Auk 121(2): 526-540.
logexp <- function(days = 1)
{
    linkfun <- function(mu) qlogis(mu^(1/days))
    linkinv <- function(eta) plogis(eta)^days
    mu.eta  <- function(eta) days * plogis(eta)^(days-1) *
      binomial()$mu.eta(eta)
    valideta <- function(eta) TRUE
    link <- paste0("logexp(", days, ")")
    structure(list(linkfun = linkfun, linkinv = linkinv,
                   mu.eta = mu.eta, valideta = valideta, name = link),
              class = "link-glm")
}
(bil3 <- binomial(logexp(3)))
## in practice this would be used with a vector of 'days', in
## which case use an offset of 0 in the corresponding formula
## to get the null deviance right.

## Binomial with identity link: often not a good idea, as both
## computationally and conceptually difficult:
binomial(link = "identity")  ## is exactly the same as
binomial(link = make.link("identity"))

## tests of quasi
x <- rnorm(100)
y <- rpois(100, exp(1+x))
glm(y ~ x, family = quasi(variance = "mu", link = "log"))
# which is the same as
glm(y ~ x, family = poisson)
glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
## Not run: glm(y ~ x, family = quasi(variance = "mu^3", link = "log")) # fails
y <- rbinom(100, 1, plogis(x))
# need to set a starting value for the next fit
glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1))

[Package *stats* version 3.6.0 Index]