family {stats} | R Documentation |

Family objects provide a convenient way to specify the details of the
models used by functions such as `glm`. See the documentation for `glm`
for the details on how such model fitting takes place.

    family(object, ...)

    binomial(link = "logit")
    gaussian(link = "identity")
    Gamma(link = "inverse")
    inverse.gaussian(link = "1/mu^2")
    poisson(link = "log")
    quasi(link = "identity", variance = "constant")
    quasibinomial(link = "logit")
    quasipoisson(link = "log")

`link` |
a specification for the model link function. This can be
a name/expression, a literal character string, a length-one character
vector or an object of class `"link-glm"` (such as generated by
`make.link`). |

`variance` |
for all families other than `quasi`, the variance function is
determined by the family. |

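`object` |
the function `family` accesses the `family` objects which are stored
within objects created by modelling functions (e.g., `glm`). |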

`...` |
further arguments passed to methods. |

`family` is a generic function with methods for classes `"glm"` and
`"lm"` (the latter returning `gaussian()`).
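As a small illustration of the two methods, the sketch below fits one model of each class and extracts its family. The use of the built-in `cars` data set and the object names are choices made for this example only.

```r
# Fit an ordinary linear model and a Poisson GLM on a built-in data set.
fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, family = poisson(), data = cars)

fam_lm  <- family(fit_lm)   # "lm" method: returns gaussian()
fam_glm <- family(fit_glm)  # "glm" method: returns the family stored in the fit

fam_lm$family   # "gaussian"
fam_glm$family  # "poisson"
fam_glm$link    # "log"
```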

For the `binomial` and `quasibinomial` families the response
can be specified in one of three ways:

As a factor: ‘success’ is interpreted as the factor not having the first level (and hence usually of having the second level).

As a numerical vector with values between `0` and `1`, interpreted as the proportion of successful cases (with the total number of cases given by the `weights`).

As a two-column integer matrix: the first column gives the number of successes and the second the number of failures.
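The three response forms can be compared on a small grouped data set. This is a sketch only: the dose and count values are invented for illustration, and the loose tolerance on the last comparison allows for the ungrouped fit starting its iterations from different initial values.

```r
# Hypothetical grouped data: successes and failures at four dose levels.
dose <- c(1, 2, 3, 4)
succ <- c(2, 5, 9, 14)
fail <- c(18, 15, 11, 6)
n    <- succ + fail

# Two-column matrix: successes first, failures second.
fit_mat  <- glm(cbind(succ, fail) ~ dose, family = binomial())

# Proportion of successes, with the number of cases supplied as weights.
fit_prop <- glm(succ / n ~ dose, family = binomial(), weights = n)

# Factor response on ungrouped data: the first level ("no") is 'failure'.
y_long    <- factor(rep(rep(c("no", "yes"), 4), times = c(rbind(fail, succ))),
                    levels = c("no", "yes"))
dose_long <- rep(dose, times = n)
fit_fac   <- glm(y_long ~ dose_long, family = binomial())

# All three specifications lead to the same coefficient estimates.
all.equal(unname(coef(fit_mat)), unname(coef(fit_prop)))
all.equal(unname(coef(fit_mat)), unname(coef(fit_fac)), tolerance = 1e-4)
```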

The `quasibinomial` and `quasipoisson` families differ from
the `binomial` and `poisson` families only in that the
dispersion parameter is not fixed at one, so they can model
over-dispersion. For the binomial case see McCullagh and Nelder
(1989, pp. 124–8). Although they show that there is (under some
restrictions) a model with
variance proportional to mean as in the quasi-binomial model, note
that `glm` does not compute maximum-likelihood estimates in that
model. The behaviour of S is closer to the quasi- variants.
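A short sketch of what "dispersion not fixed at one" means in practice, reusing the count data from the examples on this page. The object names `fit_pois` and `fit_quasi` are introduced here for illustration.

```r
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)

fit_pois  <- glm(counts ~ outcome + treatment, family = poisson())
fit_quasi <- glm(counts ~ outcome + treatment, family = quasipoisson())

# The point estimates agree between the two families.
all.equal(coef(fit_pois), coef(fit_quasi))

# The dispersion is fixed at 1 for poisson but estimated for quasipoisson.
disp_pois  <- summary(fit_pois)$dispersion
disp_quasi <- summary(fit_quasi)$dispersion

# Standard errors are rescaled by the square root of the estimated dispersion.
se_pois  <- coef(summary(fit_pois))[, "Std. Error"]
se_quasi <- coef(summary(fit_quasi))[, "Std. Error"]
all.equal(se_quasi, se_pois * sqrt(disp_quasi))

# No likelihood is defined for the quasi- family, so AIC is not available.
AIC(fit_quasi)  # NA
```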

An object of class `"family"` (which has a concise print method).
This is a list with elements

`family` |
character: the family name. |

`link` |
character: the link name. |

`linkfun` |
function: the link. |

`linkinv` |
function: the inverse of the link function. |

`variance` |
function: the variance as a function of the mean. |

`dev.resids` |
function giving the deviance residuals as a function
of `(y, mu, wt)`. |

`aic` |
function giving the AIC value if appropriate (but `NA` for the quasi- families). |

`mu.eta` |
function: derivative `function(eta)` dmu/deta. |

`initialize` |
expression. This needs to set up whatever data
objects are needed for the family as well as `n` (needed for
AIC in the binomial family) and `mustart` (see `glm`). |

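`validmu` |
logical function. Returns `TRUE` if a mean vector `mu` is within the domain of `variance`. |

`valideta` |
logical function. Returns `TRUE` if a linear predictor `eta` is within the domain of `linkinv`. |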

`simulate` |
(optional) function `simulate(object, nsim)` to be called by the `"family"` method of `simulate`. |
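The components listed above can be inspected directly. The following sketch uses `binomial()` with its default logit link and checks a few relationships numerically; the test values in `mu` and the finite-difference step `h` are arbitrary choices for this example.

```r
fam <- binomial()          # logit link by default

names(fam)                 # the components of the "family" list

mu  <- c(0.1, 0.5, 0.9)
eta <- fam$linkfun(mu)     # logit: log(mu / (1 - mu))

all.equal(fam$linkinv(eta), mu)             # linkinv undoes linkfun
all.equal(fam$variance(mu), mu * (1 - mu))  # binomial variance function

# mu.eta is the derivative of mu with respect to eta;
# compare it with a central finite difference of linkinv.
h  <- 1e-6
fd <- (fam$linkinv(eta + h) - fam$linkinv(eta - h)) / (2 * h)
all.equal(fam$mu.eta(eta), fd, tolerance = 1e-6)

fam$validmu(mu)            # TRUE: all means lie strictly inside (0, 1)
fam$valideta(eta)          # TRUE
```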

The `link` and `variance` arguments have rather awkward
semantics for back-compatibility. The recommended way is to supply
them as quoted character strings, but they can also be supplied
unquoted (as names or expressions). In addition, they can be
supplied as a length-one character vector giving the name of one of
the options, or as a list (for `link`, of class `"link-glm"`).
The restrictions apply only to links given as
names: when given as a character string all the links known to
`make.link` are accepted.
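The different ways of supplying `link` can be tried side by side. This sketch uses the probit link with `binomial`; the variable names are introduced for illustration, and the last two calls show how a link that is not an allowed name for the family behaves when given as a string versus unquoted.

```r
f1 <- binomial(link = "probit")             # quoted string (recommended)
f2 <- binomial(link = probit)               # unquoted name
lk <- "probit"
f3 <- binomial(link = lk)                   # length-one character vector
f4 <- binomial(link = make.link("probit"))  # object of class "link-glm"

# All four calls select the same link.
link_names <- sapply(list(f1, f2, f3, f4), function(f) f$link)
link_names

# "sqrt" is not an allowed name for binomial, but as a character string
# it is handed to make.link(), which knows it.
f_sqrt <- binomial(link = "sqrt")
f_sqrt$link

# Given unquoted, the same link is rejected with an error.
res <- tryCatch(binomial(link = sqrt), error = function(e) "rejected")
res
```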

This is potentially ambiguous: supplying `link = logit` could mean
the unquoted name of a link or the value of object `logit`. It
is interpreted if possible as the name of an allowed link, then
as an object. (You can force the interpretation to always be the value of
an object via `logit[1]`.)
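The ambiguity can be made concrete with a deliberately confusing variable. In this sketch the object `logit` is assigned the string `"probit"` purely for demonstration.

```r
logit <- "probit"   # an object that happens to share its name with a link

by_name  <- binomial(link = logit)$link     # read as the unquoted link name
by_value <- binomial(link = logit[1])$link  # forced to use the object's value

by_name   # "logit"
by_value  # "probit"
```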

The design was inspired by S functions of the same names described
in Hastie & Pregibon (1992) (except `quasibinomial` and
`quasipoisson`).

McCullagh, P. and Nelder, J. A. (1989)
*Generalized Linear Models.*
London: Chapman and Hall.

Dobson, A. J. (1983)
*An Introduction to Statistical Modelling.*
London: Chapman and Hall.

Cox, D. R. and Snell, E. J. (1981)
*Applied Statistics; Principles and Examples.*
London: Chapman and Hall.

Hastie, T. J. and Pregibon, D. (1992)
*Generalized linear models.*
Chapter 6 of *Statistical Models in S*
eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

For binomial *coefficients*, `choose`;
the binomial and negative binomial *distributions*,
`Binomial`, and `NegBinomial`.

    require(utils) # for str

    nf <- gaussian()  # Normal family
    nf
    str(nf)

    gf <- Gamma()
    gf
    str(gf)
    gf$linkinv
    gf$variance(-3:4) #- == (.)^2

    ## quasipoisson. compare with example(glm)
    counts <- c(18,17,15,20,10,20,25,13,12)
    outcome <- gl(3,1,9)
    treatment <- gl(3,3)
    d.AD <- data.frame(treatment, outcome, counts)
    glm.qD93 <- glm(counts ~ outcome + treatment, family = quasipoisson())
    glm.qD93
    anova(glm.qD93, test = "F")
    summary(glm.qD93)
    ## for Poisson results use
    anova(glm.qD93, dispersion = 1, test = "Chisq")
    summary(glm.qD93, dispersion = 1)

    ## Example of user-specified link, a logit model for p^days
    ## See Shaffer, T.  2004. Auk 121(2): 526-540.
    logexp <- function(days = 1)
    {
        linkfun <- function(mu) qlogis(mu^(1/days))
        linkinv <- function(eta) plogis(eta)^days
        ## chain rule: d/d(eta) plogis(eta)^days
        mu.eta <- function(eta) days * plogis(eta)^(days-1) *
            binomial()$mu.eta(eta)
        valideta <- function(eta) TRUE
        link <- paste0("logexp(", days, ")")
        structure(list(linkfun = linkfun, linkinv = linkinv,
                       mu.eta = mu.eta, valideta = valideta, name = link),
                  class = "link-glm")
    }
    binomial(logexp(3))
    ## in practice this would be used with a vector of 'days', in
    ## which case use an offset of 0 in the corresponding formula
    ## to get the null deviance right.

    ## Binomial with identity link: often not a good idea.
    ## Not run: binomial(link = make.link("identity"))

    ## tests of quasi
    x <- rnorm(100)
    y <- rpois(100, exp(1+x))
    glm(y ~ x, family = quasi(variance = "mu", link = "log"))
    # which is the same as
    glm(y ~ x, family = poisson)
    glm(y ~ x, family = quasi(variance = "mu^2", link = "log"))
    ## Not run: glm(y ~ x, family = quasi(variance = "mu^3", link = "log")) # fails
    y <- rbinom(100, 1, plogis(x))
    # needs to set a starting value for the next fit
    glm(y ~ x, family = quasi(variance = "mu(1-mu)", link = "logit"), start = c(0,1))

[Package *stats* version 3.3.0]