[R] basic question predict GLM offset

peter dalgaard pdalgd at gmail.com
Sun Apr 15 09:30:04 CEST 2012


On Apr 15, 2012, at 01:37 , David Winsemius wrote:

> 
> On Apr 14, 2012, at 6:47 PM, smfa wrote:
> 
>> Hi,
>> 
>> I know this is probably a basic question... But I don't seem to find the
>> answer.
>> 
>> I'm fitting a GLM with a Poisson family, and then tried to get a look at the
>> predictions, however the offset does seem to be taken into consideration:
>> 
>> model_glm=glm(cases~rhs(data$year,2003)+lhs(data$year,2003),
>> offset=(log(population)), data=data, subset=28:36, family=poisson())
>> 
>> predict (model_glm, type="response")
>> 
>> I get cases not rates...
>> 
>> I've tried also
>> 
>> model_glm=glm(cases~rhs(data$year,2003)+lhs(data$year,2003)+
>> offset(log(population)), data=data, subset=28:36, family=poisson())
>> 
>> with the same results. However when I predict from GAM, using mgcv, the
>> predictions consider the offset (I get rates).
> 
> The beta coefficients are the log-rate-estimates when you use log(population) as the offset.

But they are not the log predicted rates if you are describing many rates using a few parameters.

> 
>> I'm missing something?
> 
> You are most definitely missing the part where you include 'data'.

True. (Writing the formula as cases ~ rhs(year, 2003) + lhs(year, 2003) is right; the data$year version only works if you predict on the same data set.)
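
To illustrate the point about data$year, here is a minimal sketch on simulated data. The thread does not show the definitions of rhs() and lhs(), so the hinge functions below are hypothetical stand-ins, and the data frame dat and its values are made up for the example.

```r
# Simulated data: column names follow the thread, values are invented.
set.seed(2)
dat <- data.frame(year = 2000:2008,
                  population = round(seq(50000, 58000, length.out = 9)))
dat$cases <- rpois(nrow(dat), lambda = 2e-4 * dat$population)

# Hypothetical stand-ins for the poster's rhs()/lhs() (assumed hinge terms).
rhs <- function(x, knot) pmax(x - knot, 0)
lhs <- function(x, knot) pmin(x - knot, 0)

# Variables named bare, so they are looked up in 'data' (and in 'newdata').
fit_good <- glm(cases ~ rhs(year, 2003) + lhs(year, 2003) +
                  offset(log(population)),
                family = poisson(), data = dat)

# Same model, but year is hard-wired to the original data frame.
fit_bad <- glm(cases ~ rhs(dat$year, 2003) + lhs(dat$year, 2003) +
                 offset(log(population)),
               family = poisson(), data = dat)

new_years <- data.frame(year = c(2009, 2010), population = 1)

# Works: one prediction per row of new_years.
p_good <- predict(fit_good, newdata = new_years, type = "response")

# The year terms still come from dat, not new_years, so predict() cannot
# line them up with the new rows; capture whatever condition is signalled.
res_bad <- tryCatch(
  predict(fit_bad, newdata = new_years, type = "response"),
  warning = function(w) w,
  error   = function(e) e
)
```

Here p_good has one value per new row, while res_bad holds the warning or error raised because the dat$year terms ignore newdata.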

More to the point: does the OP realize how easy it is to go from fitted cases to rates by dividing by the population size?

A logical way to get predicted rates would be to make predictions for a new data set where the population size was set to 1 (or 100000, maybe), but it seems easier to "divide and conquer" (pardon the pun).
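
Both routes can be sketched as follows. This is only an illustration on simulated data: the data frame dat, its values, and the simple linear trend in year (used in place of the poster's rhs()/lhs() terms) are assumptions, not the OP's model.

```r
# Simulated data: column names follow the thread, values are invented.
set.seed(1)
dat <- data.frame(year = 2000:2008,
                  population = round(seq(50000, 58000, length.out = 9)))
dat$cases <- rpois(nrow(dat), lambda = 2e-4 * dat$population)

# Poisson GLM with log(population) as offset; a plain linear trend is
# assumed here instead of the poster's rhs()/lhs() terms.
fit <- glm(cases ~ I(year - 2003) + offset(log(population)),
           family = poisson(), data = dat)

# Route 1 ("divide and conquer"): fitted cases divided by population.
rate_divide <- predict(fit, type = "response") / dat$population

# Route 2: predict for a copy of the data with population set to 1,
# so the offset becomes log(1) = 0 and the response is a rate per person.
unit_pop <- transform(dat, population = 1)
rate_newdata <- predict(fit, newdata = unit_pop, type = "response")

# The two routes agree up to floating-point error.
same_rates <- isTRUE(all.equal(unname(rate_divide), unname(rate_newdata)))
```

Setting population to 100000 in unit_pop instead would give rates per 100000.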

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


