[R] Precision in R

adelmaas@musc.edu adelmaas at musc.edu
Thu Jul 22 16:02:28 CEST 2004


On 22 Jul, at 06:09, r-help-request at stat.math.ethz.ch wrote:

> Message: 5
> Date: Wed, 21 Jul 2004 13:48:53 +0200
> From: bhx2 at mevik.net ( Bjørn-Helge Mevik )
> Subject: Re: [R] Precision in R
> To: r-help at stat.math.ethz.ch
> Message-ID: <m0llhdbmxm.fsf at bar.nemo-project.org>
> Content-Type: text/plain; charset=iso-8859-1
>
> Since you didn't say anything about _what_ you did, either in SAS or
> R, my first thought was:  Have you checked that you use the same
> parametrization of the models in R and SAS?

Well, I'm running Poisson regressions for the incidence of childhood
acute lymphoblastic leukemia in a set of US counties (and in this data
set, for some reason, Hawaii counts as an entire county).  Separate
models are calculated for males and females.  The independent variables
of interest are race ("white", "black", "other") and (in the model for
males only) -log(proportion of people in the county who moved between
1985 and 1990) (AKA "minus log proportion moved" or "MLPM").

SAS code:
> title "Males";
> proc genmod data=males order=formatted;
>         class race sex;
>         model observed = race mlpm*mlpm*mlpm mlpm*mlpm mlpm / 
> dist=poisson link=log offset=lPYAR covb;
>
> run;
>
> title "Females";
> proc genmod data=females order=formatted;
>         class race sex;
>         model observed = race / dist=poisson link=log offset=lPYAR;
> run;

R code:
> Female.model <- glm(Observed ~ Black + Other, family = 
> poisson(link=log), offset=log(PYAR), data=Females)
>
> Male.model <- glm(Observed ~ Black + Other + 
> I(Minus.log.proportion.moved^3) + I(Minus.log.proportion.moved^2) + 
> Minus.log.proportion.moved, family = poisson(link=log), 
> offset=log(PYAR), data=Males)

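Race is entered differently in the two programs because I want both of
them to use "whites" as the referent group (I have more data from them
than from "blacks" and "others").  SAS's class statement ends up
treating W as the referent (its estimate is fixed at 0 in the output
below), and in R I get the same thing by entering Black and Other as
separate indicator variables.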
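For what it's worth, the indicator-variable approach and a single factor with "W" as its reference level should describe the same model.  Here is a minimal sketch on simulated data; the data frame `d`, its values, and the single `Race` column are made up for illustration and are not my actual data set:

```r
# Simulated stand-in data; all names and values here are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(
  Race = factor(sample(c("B", "O", "W"), n, replace = TRUE)),
  PYAR = runif(n, 1e4, 1e5)
)
d$Observed <- rpois(n, lambda = d$PYAR * exp(-9.5 - 0.7 * (d$Race == "B")))

# Approach used in the post: explicit logical indicators, whites as referent.
d$Black <- d$Race == "B"
d$Other <- d$Race == "O"
fit_dummy <- glm(Observed ~ Black + Other + offset(log(PYAR)),
                 family = poisson(link = "log"), data = d)

# Alternative: keep one factor and make "W" its reference level.
d$Race <- relevel(d$Race, ref = "W")
fit_factor <- glm(Observed ~ Race + offset(log(PYAR)),
                  family = poisson(link = "log"), data = d)

# Same model, different parametrization: estimates and deviance should agree.
same_coef <- all.equal(unname(coef(fit_dummy)), unname(coef(fit_factor)),
                       tolerance = 1e-6)
same_dev <- all.equal(deviance(fit_dummy), deviance(fit_factor))
print(same_coef)
print(same_dev)
```

If the two fits agree on the real data as well, the race coding can probably be ruled out as the source of any SAS/R difference.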

SAS results:
>                                               Males           12:08 Wednesday, April 21, 2004 173
>
>                                       The GENMOD Procedure
>
>                                        Model Information
>
>                                 Data Set              WORK.MALES
>                                 Distribution             Poisson
>                                 Link Function                Log
>                                 Dependent Variable      Observed
>                                 Offset Variable            lPYAR
>                                 Observations Used            526
>
>
>                                      Class Level Information
>
>                                    Class      Levels    Values
>
>                                    Race            3    B O W
>                                    Sex             1    M
>
>
>                                      Parameter Information
>
>                              Parameter       Effect            Race
>
>                              Prm1            Intercept
>                              Prm2            Race              B
>                              Prm3            Race              O
>                              Prm4            Race              W
>                              Prm5            mlPM*mlPM*mlPM
>                              Prm6            mlPM*mlPM
>                              Prm7            mlPM
>
>
>                              Criteria For Assessing Goodness Of Fit
>
>                   Criterion                 DF           Value        Value/DF
>
>                   Deviance                 520        239.5025          0.4606
>                   Scaled Deviance          520        239.5025          0.4606
>                   Pearson Chi-Square       520        360.5677          0.6934
>                   Scaled Pearson X2        520        360.5677          0.6934
>                   Log Likelihood                      320.5910
>
> 
>                                               Males           12:08 Wednesday, April 21, 2004 174
>
>                                       The GENMOD Procedure
>
>            Algorithm converged.
>
>
>                                   Estimated Covariance Matrix
>
>               Prm1         Prm2         Prm3         Prm5         Prm6         Prm7
>
>  Prm1      9.25071     -0.01841      0.04877    -13.71192     37.88798    -33.20414
>  Prm2     -0.01841      0.03392     0.002521      0.03045     -0.07720      0.06191
>  Prm3      0.04877     0.002521      0.02027     -0.07622      0.21457     -0.18748
>  Prm5    -13.71192      0.03045     -0.07622     22.11044    -59.26190     50.49281
>  Prm6     37.88798     -0.07720      0.21457    -59.26190       160.70      -138.32
>  Prm7    -33.20414      0.06191     -0.18748     50.49281      -138.32       120.18
>
>
>                                 Analysis Of Parameter Estimates
>
>                                         Standard   Wald 95% Confidence      Chi-
>    Parameter            DF   Estimate      Error          Limits          Square   Pr > ChiSq
>
>    Intercept             1   -15.8294     3.0415   -21.7907    -9.8682     27.09       <.0001
>    Race             B    1    -0.6646     0.1842    -1.0256    -0.3036     13.02       0.0003
>    Race             O    1    -0.1058     0.1424    -0.3848     0.1733      0.55       0.4575
>    Race             W    0     0.0000     0.0000     0.0000     0.0000       .          .
>    mlPM*mlPM*mlPM        1    15.4205     4.7022     6.2044    24.6366     10.75       0.0010
>    mlPM*mlPM             1   -36.8423    12.6768   -61.6884   -11.9961      8.45       0.0037
>    mlPM                  1    27.2989    10.9627     5.8124    48.7855      6.20       0.0128
>    Scale                 0     1.0000     0.0000     1.0000     1.0000
>
> NOTE: The scale parameter was held fixed.
>
> 
>                                              Females          12:08 Wednesday, April 21, 2004 175
>
>                                       The GENMOD Procedure
>
>                                        Model Information
>
>                                Data Set              WORK.FEMALES
>                                Distribution               Poisson
>                                Link Function                  Log
>                                Dependent Variable        Observed
>                                Offset Variable              lPYAR
>                                Observations Used              534
>
>
>                                      Class Level Information
>
>                                    Class      Levels    Values
>
>                                    Race            3    B O W
>                                    Sex             1    F
>
>
>                              Criteria For Assessing Goodness Of Fit
>
>                   Criterion                 DF           Value        Value/DF
>
>                   Deviance                 531        245.2305          0.4618
>                   Scaled Deviance          531        245.2305          0.4618
>                   Pearson Chi-Square       531        484.8219          0.9130
>                   Scaled Pearson X2        531        484.8219          0.9130
>                   Log Likelihood                      183.8640
>
>
>            Algorithm converged.
>
>
>                                  Analysis Of Parameter Estimates
>
>                                       Standard     Wald 95% Confidence       Chi-
>   Parameter         DF    Estimate       Error           Limits            Square    Pr > ChiSq
>
>   Intercept          1     -9.7630      0.0577     -9.8762     -9.6499    28595.0        <.0001
>   Race         B     1     -1.0917      0.2493     -1.5803     -0.6030      19.17        <.0001
>   Race         O     1      0.0014      0.1569     -0.3061      0.3088       0.00        0.9931
>   Race         W     0      0.0000      0.0000      0.0000      0.0000        .           .
>
> 
>                                              Females          12:08 Wednesday, April 21, 2004 176
>
>                                       The GENMOD Procedure
>
>                                  Analysis Of Parameter Estimates
>
>                                       Standard     Wald 95% Confidence       Chi-
>   Parameter         DF    Estimate       Error           Limits            Square    Pr > ChiSq
>
>   Scale              0      1.0000      0.0000      1.0000      1.0000
>
> NOTE: The scale parameter was held fixed.

R results:
> > summary(Female.model)
>
> Call:
> glm(formula = Observed ~ Black + Other, family = poisson(link = log),
>     data = Females, offset = log(PYAR))
>
> Deviance Residuals:
>     Min       1Q   Median       3Q      Max
> -2.4060  -0.5315  -0.1109  -0.0284   2.6520
>
> Coefficients:
>              Estimate Std. Error  z value Pr(>|z|)
> (Intercept) -9.763025   0.057735 -169.101  < 2e-16 ***
> BlackTRUE   -1.091679   0.249309   -4.379 1.19e-05 ***
> OtherTRUE    0.001363   0.156876    0.009    0.993
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for poisson family taken to be 1)
>
>     Null deviance: 272.49  on 533  degrees of freedom
> Residual deviance: 245.23  on 531  degrees of freedom
> AIC: 520.71
>
> Number of Fisher Scoring iterations: 7
>
> > summary(Male.model)
>
> Call:
> glm(formula = Observed ~ Black + Other + I(Minus.log.proportion.moved^3) +
>     I(Minus.log.proportion.moved^2) + Minus.log.proportion.moved,
>     family = poisson(link = log), data = Males, offset = log(PYAR))
>
> Deviance Residuals:
>      Min        1Q    Median        3Q       Max
> -2.24568  -0.49137  -0.10197  -0.03262   3.88346
>
> Coefficients:
>                                  Estimate Std. Error z value Pr(>|z|)
> (Intercept)                     -16.39065    3.31644  -4.942 7.72e-07 ***
> BlackTRUE                        -0.66461    0.18418  -3.608 0.000308 ***
> OtherTRUE                        -0.09513    0.14278  -0.666 0.505245
> I(Minus.log.proportion.moved^3)  24.39920    7.51188   3.248 0.001162 **
> I(Minus.log.proportion.moved^2) -51.17011   17.75857  -2.881 0.003959 **
> Minus.log.proportion.moved       33.48773   13.52491   2.476 0.013286 *
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for poisson family taken to be 1)
>
>     Null deviance: 278.68  on 525  degrees of freedom
> Residual deviance: 240.54  on 520  degrees of freedom
> AIC: 582.68
>
> Number of Fisher Scoring iterations: 6

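Now, you'll notice (after scrolling up and down a lot) that the two
programs agree for females (same estimates, same standard errors, and
a deviance of 245.23 in both), but disagree for males.  In the male
models the Black coefficient matches (-0.6646), but the intercept, the
Other coefficient, and all three MLPM terms differ, and so does the
deviance (239.50 in SAS vs. 240.54 in R, both on 520 degrees of
freedom).  Does anybody have any ideas why I'm getting a difference and
which program (if either) is giving me the right answer?  Thanks in
advance again.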
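One check that might narrow this down: the Poisson log-likelihood with a log link is concave in the coefficients, so two converged fits of the same model to the same data should report the same deviance.  Since the male deviances differ, it may be worth evaluating the deviance in R at SAS's estimates to see whether the two programs are really looking at the same MLPM values.  Below is a minimal sketch on simulated data; the helper `poisson_deviance_at` and the variables `x`, `pyar`, and `y` are hypothetical stand-ins, not my actual data:

```r
# Poisson deviance for a log-link model at an arbitrary coefficient vector.
# Hypothetical helper written for this sketch, not part of any package.
poisson_deviance_at <- function(beta, X, y, offset = 0) {
  mu <- exp(drop(X %*% beta) + offset)
  # The term y * log(y / mu) is taken to be 0 when y == 0.
  term <- ifelse(y > 0, y * log(y / mu), 0)
  2 * sum(term - (y - mu))
}

# Simulated stand-in data (made-up values).
set.seed(42)
n <- 300
x <- runif(n, 0.3, 1.2)            # stands in for MLPM
pyar <- runif(n, 1e4, 1e5)         # stands in for PYAR
y <- rpois(n, pyar * exp(-10 + 1.5 * x))

fit <- glm(y ~ I(x^3) + I(x^2) + x + offset(log(pyar)),
           family = poisson(link = "log"))
X <- model.matrix(fit)

# At glm's own estimates the helper should reproduce deviance(fit).
dev_hat <- poisson_deviance_at(coef(fit), X, y, log(pyar))
matches_glm <- all.equal(dev_hat, deviance(fit))

# Moving away from the fitted coefficients should not lower the deviance.
other <- coef(fit) + c(0.01, 0, 0, 0)
dev_other <- poisson_deviance_at(other, X, y, log(pyar))
print(matches_glm)
print(dev_other > dev_hat)
```

With the real data, `beta` would be the estimates printed by SAS (in the column order of `model.matrix(Male.model)`) and `offset` would be `log(PYAR)`.  The printed SAS estimates are rounded to four decimals, so this would only be an approximate check, but a deviance noticeably below R's 240.54 would suggest that R stopped short of the optimum, while a noticeably larger one would suggest that the two programs are not seeing the same data.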

Aaron

-------------
Aaron Solomon (ben Saul Joseph) Adelman
E-mail:  adelmaas at musc.edu
Web site:  http://people.musc.edu/~adelmaas/
AOL Instant Messenger & Yahoo! Messenger:  Hiergargo
AIM chat-room (preferred):  Adelmania



