[R] Estimating group modes using GLMs for skewed distributions

Daniel Meddings dpmeddings at gmail.com
Tue Aug 25 09:44:58 CEST 2015


I am wondering why, for generalized linear models with Gamma, Poisson and
Negative Binomial distributions, there appears to be no discussion of
estimating the medians or the modes of the distributions. For example,
in clinical trials for count data where a log link is used, it is the
quantity

E[Y|T] / E[Y|C] = exp(beta_T + beta^{-}x^{*}) / exp(beta_C + beta^{-}x^{*})
                = exp(beta_T) / exp(beta_C)

that seems to be of interest, where beta_T and beta_C are the effects of
treatment and control respectively, x^{*} is the chosen covariate point at
which to estimate the ratio (its value does not matter here since the
covariate terms cancel), and beta^{-} denotes the model parameters excluding
the treatment and control effects.
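
A quick numerical check of the cancellation, as a minimal sketch in Python.
The coefficient values and covariate points are made up for illustration and
do not come from any fitted model; `mean_ratio` is a hypothetical helper name.

```python
import math

def mean_ratio(beta_t, beta_c, beta_rest, x_star):
    """Ratio E[Y|T] / E[Y|C] under a log link at covariate point x_star."""
    eta_rest = sum(b * x for b, x in zip(beta_rest, x_star))  # beta^{-} x^{*}
    return math.exp(beta_t + eta_rest) / math.exp(beta_c + eta_rest)

# Hypothetical coefficients, chosen only for illustration.
beta_t, beta_c = 0.9, 0.4
beta_rest = [0.05, -0.3]

# Two very different covariate points give the same ratio.
ratio_a = mean_ratio(beta_t, beta_c, beta_rest, [1.0, 2.0])
ratio_b = mean_ratio(beta_t, beta_c, beta_rest, [10.0, -4.0])
print(ratio_a, ratio_b, math.exp(beta_t - beta_c))
```

All three printed values agree up to floating-point error, since the
covariate contribution appears in both numerator and denominator.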

Whilst I have no objection to this ratio, I would also wish to know the
mode or the median of the treated and control groups (and the difference in
these quantities), given that these distributions are skewed (i.e. the mean
is not the most relevant summary).

For example, for a skewed continuous variable modeled with the gamma
distribution, if alpha is the shape parameter and the mean is
exp(beta_T+beta^{-}x^{*}), then the scale is the mean divided by alpha and
the mode for treated subjects at x^{*} is given as follows

mode(Y|T) = ((alpha-1)/alpha) * exp(beta_T+beta^{-}x^{*})

as long as alpha >= 1. However, I see no mention of this kind of summary
being estimated in these GLMs and I am wondering why. Is it perhaps that
the ratio of means is less sensitive to small treatment effects
than a difference in modes or medians is, i.e. analogous to risk ratios
generally being preferred to risk differences when comparing disease
incidence rates?
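
To make the gamma mode expression concrete, here is a small Python sketch
that computes the mean and the mode from a linear predictor and a shape
parameter, then checks the mode against a crude grid search over the
density. The values of `eta` and `alpha` are hypothetical, and the function
names are mine rather than anything from a GLM package.

```python
import math

def gamma_mean_and_mode(eta, alpha):
    """Mean and mode of a Gamma response with log link.

    eta   : linear predictor, e.g. beta_T + beta^{-} x^{*}
    alpha : Gamma shape parameter (must be >= 1 for this mode formula)
    """
    if alpha < 1:
        raise ValueError("mode formula requires alpha >= 1")
    mu = math.exp(eta)                # mean under the log link
    mode = (alpha - 1) / alpha * mu   # scale = mu / alpha, mode = (alpha - 1) * scale
    return mu, mode

def gamma_log_density(y, mu, alpha):
    """Log density of a Gamma with shape alpha and mean mu (scale = mu / alpha)."""
    scale = mu / alpha
    return ((alpha - 1) * math.log(y) - y / scale
            - math.lgamma(alpha) - alpha * math.log(scale))

# Hypothetical values for illustration only.
eta, alpha = 1.2, 3.0
mu, mode = gamma_mean_and_mode(eta, alpha)

# Crude grid search for the density maximum as a sanity check on the formula.
grid = [i * mu / 10000 for i in range(1, 30001)]
numeric_mode = max(grid, key=lambda y: gamma_log_density(y, mu, alpha))
print(mu, mode, numeric_mode)
```

The grid maximum should land within one grid step of the closed-form mode.
The median has no comparably simple closed form for the gamma distribution,
so it would need a numerical quantile routine instead.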

The reason I am interested in estimating modes or medians is that I wish to
compare how well a linear mixed model performs (which assumes normally
distributed responses) at estimating the mode or median by using the
standard mixed model estimates of the group means when the distribution of
Y is skewed. However perhaps I should be looking at how well the mixed
model estimates the ratio of means?

For comparison I have implemented the above estimation of the treatment and
control group modes using GLMs with random effects (the formula is similar
to the above but with simple functions of the random effects covariance
parameters multiplying the expression). As expected, estimates of the group
means from the mixed model agree well with the estimates of the modes from
the GLM for reasonably symmetrical distributions, but the mixed model's
mean estimates start to exceed the modes as the distribution becomes more
skewed.
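
The same pattern can be seen in the plain gamma case without random effects,
which is all the following Python sketch covers (the random-effects version
of the formula is not reproduced here). It holds a hypothetical mean fixed
and varies the shape, using the standard facts that the gamma skewness is
2/sqrt(alpha) and that mean minus mode equals mean/alpha.

```python
import math

mu = 10.0  # hypothetical group mean, held fixed while the shape varies

rows = []
for alpha in (1.0, 2.0, 5.0, 20.0, 100.0):
    mode = (alpha - 1) / alpha * mu
    skewness = 2 / math.sqrt(alpha)   # skewness of a Gamma with shape alpha
    rows.append((alpha, skewness, mode, mu - mode))

for alpha, skewness, mode, gap in rows:
    print(f"alpha={alpha:6.1f}  skewness={skewness:5.2f}  "
          f"mode={mode:6.2f}  mean-mode={gap:5.2f}")
```

As alpha grows the distribution becomes more symmetric and the gap between
mean and mode shrinks towards zero, so a method that targets the mean will
only approximate the mode well when the skewness is small.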

I can do inference on the difference in the modes using a parametric
bootstrap, so I cannot see any problems with this approach. However, if
there are some, I would welcome somebody pointing them out.
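
For readers who want to see the shape of such a bootstrap, here is a minimal
Python sketch for the simplest possible case: two groups, no covariates, no
random effects, and a common gamma shape. It is not my actual
implementation. The data are simulated, the helper names are hypothetical,
and the shape is estimated as the reciprocal of a Pearson-type dispersion
estimate, which is an assumption of this sketch rather than a recommendation.

```python
import random
import statistics

def fit_two_group_gamma(y_t, y_c):
    """Group means and a common shape for a two-group Gamma model.

    With only a group indicator in the model, the fitted means are the group
    sample means. The shape is 1 / dispersion, with the dispersion taken
    from a Pearson-type moment estimate (an assumption of this sketch).
    """
    mu_t, mu_c = statistics.fmean(y_t), statistics.fmean(y_c)
    pearson = sum(((y - mu_t) / mu_t) ** 2 for y in y_t)
    pearson += sum(((y - mu_c) / mu_c) ** 2 for y in y_c)
    dispersion = pearson / (len(y_t) + len(y_c) - 2)
    return mu_t, mu_c, 1.0 / dispersion

def mode_difference(mu_t, mu_c, alpha):
    """Difference in Gamma modes; the mode is taken as 0 when alpha < 1."""
    factor = max(alpha - 1.0, 0.0) / alpha
    return factor * (mu_t - mu_c)

def simulate(rng, mu, alpha, n):
    """Draw n Gamma variates with mean mu and shape alpha (scale = mu / alpha)."""
    return [rng.gammavariate(alpha, mu / alpha) for _ in range(n)]

def bootstrap_interval(y_t, y_c, n_boot=1000, level=0.95, seed=1):
    """Point estimate and percentile interval for the difference in modes."""
    rng = random.Random(seed)
    mu_t, mu_c, alpha = fit_two_group_gamma(y_t, y_c)
    estimate = mode_difference(mu_t, mu_c, alpha)
    draws = []
    for _ in range(n_boot):
        # Simulate from the fitted model, refit, and recompute the statistic.
        b_t = simulate(rng, mu_t, alpha, len(y_t))
        b_c = simulate(rng, mu_c, alpha, len(y_c))
        draws.append(mode_difference(*fit_two_group_gamma(b_t, b_c)))
    draws.sort()
    tail = (1 - level) / 2
    lower = draws[round(tail * n_boot)]
    upper = draws[round((1 - tail) * n_boot) - 1]
    return estimate, lower, upper

# Simulated data standing in for a real trial (all values are made up).
data_rng = random.Random(2015)
y_treated = simulate(data_rng, mu=12.0, alpha=3.0, n=150)
y_control = simulate(data_rng, mu=8.0, alpha=3.0, n=150)
estimate, lower, upper = bootstrap_interval(y_treated, y_control)
print(estimate, lower, upper)
```

The percentile interval is the simplest choice. With covariates or random
effects, the simulate and refit steps would be replaced by simulation from,
and refitting of, the full model.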

Many thanks

Dan
