[R] Comparing output from linear regression to output from quasipoisson to determine the model that fits best.

Uwe Ligges ligges at statistik.tu-dortmund.de
Tue Dec 2 09:58:32 CET 2008



John Sorkin wrote:
> R 2.7
> Windows XP
> 
> I have two models that were fit to exactly the same data, both using glm(). One model is a linear regression (gaussian(link = "identity")); the other is a quasipoisson(link = "log"). I have log likelihoods from each model. Is there any way I can determine which model is a better fit to the data? anova() does not appear to work, as the models have the same residual degrees of freedom:


Since the two models belong to quite different classes, I would proceed by
looking carefully at the residuals.
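
A minimal sketch of what such a residual check might look like, under stated assumptions: the KA data are not shown in the thread, so the data frame below is simulated (overdispersed counts and a binary HIV indicator are assumptions), and only the variable names PHYSFUNC and HIV come from the original post. With a single binary predictor both models fit the two group means, so the fitted values should coincide (in the quoted output, exp(1.434297) is about 4.1967, the gaussian intercept). What differs is the assumed variance, and that is what the Pearson residuals help to judge.

```r
# Simulated stand-in for KA; the real data are not available (assumption).
set.seed(1)
n  <- 500
KA <- data.frame(HIV = rbinom(n, 1, 0.5))
KA$PHYSFUNC <- rnbinom(n, mu = 4, size = 1)  # overdispersed counts (assumption)

fit1  <- glm(PHYSFUNC ~ HIV, data = KA)                         # gaussian, identity link
fitQP <- glm(PHYSFUNC ~ HIV, data = KA, family = quasipoisson)  # log link

# Pearson residuals: (y - mu) for gaussian, (y - mu) / sqrt(mu) for quasipoisson.
r1  <- residuals(fit1,  type = "pearson")
rQP <- residuals(fitQP, type = "pearson")

# Largest difference between the two sets of fitted means (expected to be tiny).
max_fit_diff <- max(abs(fitted(fit1) - fitted(fitQP)))

# Quasi-Poisson dispersion recomputed from the Pearson residuals.
phi_hat <- sum(rQP^2) / df.residual(fitQP)

# Spread of the residuals within each HIV group, for both models.
sd_by_group <- rbind(gaussian     = tapply(r1,  KA$HIV, sd),
                     quasipoisson = tapply(rQP, KA$HIV, sd))
print(max_fit_diff)
print(phi_hat)
print(sd_by_group)

# Graphical checks, drawn only in an interactive session.
if (interactive()) {
  op <- par(mfrow = c(1, 3))
  qqnorm(r1, main = "Gaussian: normal Q-Q"); qqline(r1)
  boxplot(r1  ~ KA$HIV, main = "Gaussian residuals by HIV")
  boxplot(rQP ~ KA$HIV, main = "Quasi-Poisson residuals by HIV")
  par(op)
}
```

A strongly skewed normal Q-Q plot would speak against the gaussian error assumption, while Pearson residuals with a similar spread across groups would be consistent with a variance proportional to the mean.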

Uwe Ligges


> fit1<-glm(PHYSFUNC~HIV,data=KA)
> summary(fit1)
> 
> fitQP<-glm(PHYSFUNC~HIV,data=KA,family=quasipoisson)
> summary(fitQP)
> 
> anova(fit1,fitQP)
> 
> 
> Program OUTPUT:
>> fit1<-glm(PHYSFUNC~HIV,data=KA)
>> summary(fit1)
> 
> Call:
> glm(formula = PHYSFUNC ~ HIV, data = KA)
> 
> Deviance Residuals: 
>    Min      1Q  Median      3Q     Max  
> -4.197  -4.192  -2.192   2.808  19.808  
> 
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)    
> (Intercept)  4.19670    0.08508   49.33   <2e-16 ***
> HIV         -0.00487    0.12071   -0.04    0.968    
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 
> (Dispersion parameter for gaussian family taken to be 22.78134)
> 
>     Null deviance: 142429  on 6253  degrees of freedom
> Residual deviance: 142429  on 6252  degrees of freedom
>   (213 observations deleted due to missingness)
> AIC: 37302
> 
> Number of Fisher Scoring iterations: 2
> 
>> fitQP<-glm(PHYSFUNC~HIV,data=KA,family=quasipoisson)
>> summary(fitQP)
> 
> Call:
> glm(formula = PHYSFUNC ~ HIV, family = quasipoisson, data = KA)
> 
> Deviance Residuals: 
>    Min      1Q  Median      3Q     Max  
> -2.897  -2.895  -1.193   1.250   6.644  
> 
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)    
> (Intercept)  1.434297   0.020280   70.72   <2e-16 ***
> HIV         -0.001161   0.028780   -0.04    0.968    
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
> 
> (Dispersion parameter for quasipoisson family taken to be 5.432011)
> 
>     Null deviance: 35439  on 6253  degrees of freedom
> Residual deviance: 35439  on 6252  degrees of freedom
>   (213 observations deleted due to missingness)
> AIC: NA
> 
> Number of Fisher Scoring iterations: 5
> 
>> anova(fit1,fitQP)
> Analysis of Deviance Table
> 
> Model 1: PHYSFUNC ~ HIV
> Model 2: PHYSFUNC ~ HIV
>   Resid. Df Resid. Dev   Df Deviance
> 1      6252     142429              
> 2      6252      35439    0   106989
> 
> 
> Thanks,
> John
> 
> 
> 
> 
> 
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> 


