[R-sig-ME] Population weights in a Poisson model with overdispersion

Theodore Lytras thlytras at gmail.com
Tue May 24 21:56:24 CEST 2016


Dear all,

I am having some trouble with the correct specification of a Poisson model 
(with lme4), particularly with respect to weights. 

Each observation is a rate from a single physician: number of patients with 
ILI (influenza-like illness) / total patients. There are no fixed covariates; 
I am only interested in estimating an average rate for all physicians, with an 
intercept-only Poisson model. Observations are clustered within geographic 
areas (8 in total), and have a fair amount of overdispersion, which I am 
handling with an observation-level random effect.

Thus for one geographic area only (say "U1"), the appropriate model would be:

m.U1 <- glmer( gritot ~ 1 + (1|codeiat), offset=log(totvis), 
				data=datU1, family="poisson")

Where: gritot = number of patients with ILI, per physician
totvis = total patients per physician
codeiat = physician ID (observation ID), as an R factor

Now, for all geographic areas, I would guess the model should be:

m.all <- glmer( gritot ~ 1 + (1|codeiat) + (1|area), offset=log(totvis),
				data=dat, family="poisson")

Where: area = area code, also as an R factor

Each codeiat only occurs within a given area, so if I understand correctly my 
grouping factors are nested (codeiat nested within area). Note that I am also 
interested in estimating the average rate for each area.


Question 1: Unfortunately in the above model, the variance for area is very 
very small compared to the variance for codeiat. From the output of 
summary(m.all):

Random effects:
 Groups  Name        Variance  Std.Dev. 
 codeiat (Intercept) 7.690e-01 8.769e-01
 area    (Intercept) 1.557e-10 1.248e-05
Number of obs: 108, groups:  codeiat, 108; area, 8

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -2.16856    0.09294  -23.33   <2e-16 ***

Is there some way (i.e. some different random-effects specification) to 
prevent the observation-level random effect from "eating away" all the 
variance? 
Background theory suggests there is a single average rate (the fixed 
intercept), around which the rate per area varies, and there is a further 
variance per physician (codeiat) around the area-specific rate. Assigning all 
the variance to the physician and none to the geographic area makes no 
theoretical or practical sense.

For example, if I drop the random effect per physician (i.e. the observation 
level random effect):

m2 <- glmer( gritot ~ 1 + (1|area), offset=log(totvis), 
				data=dat, family="poisson")

the output becomes:

Random effects:
 Groups Name        Variance Std.Dev.
 area   (Intercept) 0.05958  0.2441  
Number of obs: 108, groups:  area, 8

Fixed effects:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept) -1.95867    0.09237   -21.2   <2e-16 ***

This makes sense, but then how would I take account of the overdispersion 
**within** each geographic area?? Shouldn't I still need to specify an 
observation-level random effect somehow?? (One that does not "eat away" all 
the variance.)


Question 2: The 8 geographic areas have different population sizes, and I 
would like my fixed intercept (the average rate across the areas) to reflect 
this as well, i.e. observations from areas with big populations should weigh 
more than observations from areas with smaller populations. How would I assign 
weights to each observation, in order to get a population-weighted average 
rate as the fixed intercept of the model (also taking account of the 
overdispersion problem above)?? I am really really stuck on this one.

My initial thought would be to fit separate models for each area (with 
observation-level random-effect) and then combine the log rates (i.e. the 
fixed intercepts) in some second-level model (maybe Poisson with identity 
link??). But I have no idea how to combine the standard error of each rate and 
its population-specific weight (as offset? as weight?). And intuition tells me 
this should be possible to do within a single model.


I really appreciate any help!

Thank you so much,

Theodore Lytras

Epidemiologist, Department of Epidemiologic Surveillance,
Hellenic Centre of Disease Control and Prevention, Athens, Greece



More information about the R-sig-mixed-models mailing list