[R-sig-ME] Population weights in a Poisson model with overdispersion
Theodore Lytras
thlytras at gmail.com
Tue May 24 21:56:24 CEST 2016
Dear all,
I am having some trouble with the correct specification of a Poisson model
(with lme4), particularly with respect to weights.
Each observation is a rate from a single physician: number of patients with
ILI (influenza-like illness) / total patients. There are no fixed covariates;
I am only interested in estimating an average rate for all physicians, with an
intercept-only Poisson model. Observations are clustered within geographic
areas (8 in total), and have a fair amount of overdispersion, which I am
handling with an observation-level random effect.
Thus for one geographic area only (say "U1"), the appropriate model would be:
m.U1 <- glmer( gritot ~ 1 + (1|codeiat), offset=log(totvis),
data=datU1, family="poisson")
Where: gritot = number of patients with ILI, per physician
totvis = total patients per physician
codeiat = physician ID (observation ID), as an R factor
Now, for all geographic areas, I would guess the model should be:
m.all <- glmer( gritot ~ 1 + (1|codeiat) + (1|area), offset=log(totvis),
data=dat, family="poisson")
Where: area = area code, also as an R factor
Each codeiat only occurs within a given area, so if I understand correctly my
grouping factors are nested (codeiat nested within area). Note that I am also
interested in estimating the average rate for each area.
Question 1: Unfortunately in the above model, the variance for area is very
very small compared to the variance for codeiat. From the output of
summary(m.all):
Random effects:
Groups Name Variance Std.Dev.
codeiat (Intercept) 7.690e-01 8.769e-01
area (Intercept) 1.557e-10 1.248e-05
Number of obs: 108, groups: codeiat, 108; area, 8
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.16856 0.09294 -23.33 <2e-16 ***
Is there some way (i.e. some different random-effects specification) to
prevent the observation-level random effect from "eating away" all the
variance?
Background theory suggests there is a single average rate (the fixed
intercept), around which the rate per area varies, and there is a further
variance per physician (codeiat) around the area-specific rate. Assigning all
the variance to the physician and none to the geographic area makes no
theoretical or practical sense.
For example, if I drop the random effect per physician (i.e. the observation
level random effect):
m2 <- glmer( gritot ~ 1 + (1|area), offset=log(totvis),
data=dat, family="poisson")
the output becomes:
Random effects:
Groups Name Variance Std.Dev.
area (Intercept) 0.05958 0.2441
Number of obs: 108, groups: area, 8
Fixed effects:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.95867 0.09237 -21.2 <2e-16 ***
This makes sense, but then how would I take account of the overdispersion
**within** each geographic area?? Shouldn't I still need to specify an
observation-level random effect somehow?? (One that does not "eat away" all
the variance.)
Question 2: The 8 geographic areas have different population sizes, and I
would like my fixed intercept (the average rate across the areas) to reflect
this as well, i.e. observations from areas with big populations should weigh
more than observations from areas with smaller populations. How would I assign
weights to each observation, in order to get a population-weighted average
rate as the fixed intercept of the model (also taking account of the
overdispersion problem above)?? I am really really stuck on this one.
My initial thought would be to fit separate models for each area (with
observation-level random-effect) and then combine the log rates (i.e. the
fixed intercepts) in some second-level model (maybe Poisson with identity
link??). But I have no idea how to combine the standard error of each rate and
its population-specific weight (as offset? as weight?). And intuition tells me
this should be possible to do within a single model.
I really appreciate any help!
Thank you so much,
Theodore Lytras
Epidemiologist, Department of Epidemiologic Surveillance,
Hellenic Centre of Disease Control and Prevention, Athens, Greece
More information about the R-sig-mixed-models
mailing list