[R] gee and geepack: different results?

Thomas Lumley tlumley at u.washington.edu
Fri Oct 24 20:31:37 CEST 2003


On Fri, 24 Oct 2003 mendola at dssm.unipa.it wrote:

> Hi, I downloaded both gee and geepack, and I am trying to understand the
> differences between the two libraries.
> I used the same data and estimated the same model, with a correlation
> structure autoregressive of order 1. Surprisingly for me, I found very
> different results. Coefficients are slightly different in value but
> sometimes opposite in sign.
> Moreover, the estimate of rho (correlation coefficient) given by gee is
> 0.5759 (see element 1.2 in the working correlation matrix) while the
> estimate given by geese is 0.4519.
> Could somebody explain me what happened?

These results don't look all that different. The one that changed sign is
pretty close to zero.

The two packages are using different estimators for the correlation
parameter, and therefore different weights for the observations. This is a
widespread issue with GEE.

In reasonably large samples, if the AR-1 model is a good fit then the
different correlation estimators will give similar values, so the weights
will be similar and the regression coefficients will be similar.
Alternatively, if the mean model is a good fit then the coefficients will
be similar regardless of correlation model.  With a sample of size 6,
however, GEE really isn't advisable.  I'd probably try a linear mixed
model for log(VCRE) unless you have strong reasons for preferring to model
the reciprocal of the mean.

Note in particular that the standard errors and p-values can be quite
badly wrong with such a small sample size in GEE.

	-thomas

>
>
> Data are in  dati22, which is only a part of the data for the study I
> intend to perform. Here what I did ,using first gee and then geese:
>
> > str(dati22)
> `data.frame':   47 obs. of  7 variables:
>  $ TR    : Factor w/ 4 levels "7","8","9","11": 1 1 1 1 1 1 1 1 1 1 ...
>  $ STAT  : Factor w/ 3 levels "1","2","3": 1 1 1 1 2 2 2 2 3 3 ...
>  $ PIANTA: int  2 2 2 2 2 2 2 2 41 41 ...
>  $ ANLEP : int  1999 1998 1999 1998 1999 1998 1999 1998 1999 1998 ...
>  $ eta   : int  12 11 10 9 11 10 11 10 14 13 ...
>  $ VCRE  : num  5.3 6.9 11 9.9 7.9 9.2 14.2 11.9 10.5 10 ...
>  $ temp  : num  19.7 20.0 19.7 20.0 19.7 ...
>
>
> > mio1.2<- gee(VCRE ~ TR + STAT + temp + eta, id = PIANTA, +
>  + data = dati22,  family = Gamma, corstr = "AR-M", Mv = 1)
>
> > summary(mio1.2)
>
>  GEE:  GENERALIZED LINEAR MODELS FOR DEPENDENT DATA
>  gee S-function, version 4.13 modified 98/01/27 (1998)
>
> Model:
>  Link:                      Reciprocal
>  Variance to Mean Relation: Gamma
>  Correlation Structure:     AR-M , M = 1
>
> Call:
> gee(formula = VCRE ~ TR + STAT + temp + eta, id = PIANTA, data = dati22,
>     family = Gamma, corstr = "AR-M", Mv = 1)
>
> Summary of Residuals:
>        Min         1Q     Median         3Q        Max
> -4.5154238 -2.0766622 -0.2153521  0.9418182  6.3871421
>
>
> Coefficients:
>                 Estimate  Naive S.E.     Naive z Robust S.E.    Robust z
> (Intercept)  0.411995251 0.174052364  2.36707644 0.131632816  3.12988253
> TR8         -0.001154422 0.021191593 -0.05447549 0.011807163 -0.09777305
> TR9          0.019559907 0.024379471  0.80231056 0.008993803  2.17482050
> TR11        -0.041092894 0.021609580 -1.90160537 0.015384050 -2.67113620
> STAT2        0.023886745 0.014219390  1.67987130 0.013543550  1.76369899
> STAT3        0.045749728 0.016844262  2.71604237 0.012862504  3.55682917
> temp        -0.020141682 0.008819798 -2.28368975 0.006851784 -2.93962600
> eta          0.008021081 0.001465212  5.47434944 0.001129986  7.09838725
>
> Estimated Scale Parameter:  0.1049601
> Number of Iterations:  13
>
> Working Correlation
>        [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]
> [1,] 1.0000 0.5758 0.3315 0.1909 0.1099 0.0633 0.0364 0.0210
> [2,] 0.5758 1.0000 0.5758 0.3315 0.1909 0.1099 0.0633 0.0364
> [3,] 0.3315 0.5758 1.0000 0.5758 0.3315 0.1909 0.1099 0.0633
> [4,] 0.1909 0.3315 0.5758 1.0000 0.5758 0.3315 0.1909 0.1099
> [5,] 0.1099 0.1909 0.3315 0.5758 1.0000 0.5758 0.3315 0.1909
> [6,] 0.0633 0.1099 0.1909 0.3315 0.5758 1.0000 0.5758 0.3315
> [7,] 0.0364 0.0633 0.1099 0.1909 0.3315 0.5758 1.0000 0.5758
> [8,] 0.0210 0.0364 0.0633 0.1099 0.1909 0.3315 0.5758 1.0000
>
> USING geese:
>
> > mio1.2pack<-geese(VCRE ~ TR + STAT + temp + eta, id = PIANTA, data =
> dati22, +
>     + family = Gamma, corstr = "ar1")
>
>
> Coefficients:
>             estimate  san.se   wald        p
> (Intercept)  0.44799 0.18279  6.007 1.43e-02
> TR8          0.00444 0.00985  0.203 6.52e-01
> TR9          0.02182 0.00632 11.923 5.54e-04
> TR11        -0.03728 0.01407  7.024 8.04e-03
> STAT2        0.02116 0.01476  2.056 1.52e-01
> STAT3        0.04270 0.01245 11.758 6.06e-04
> temp        -0.02233 0.00973  5.273 2.17e-02
> eta          0.00865 0.00136 40.683 1.79e-10
>
> Scale Model:
>  Scale Link:                identity
>
>  Estimated Scale Parameters:
>             estimate san.se wald       p
> (Intercept)    0.081 0.0287 7.98 0.00474
>
> Correlation Model:
>  Correlation Structure:     ar1
>  Correlation Link:          identity
>
>  Estimated Correlation Parameters:
>       estimate san.se wald      p
> alpha    0.452  0.272 2.77 0.0963
>
> Returned Error Value:    0
> Number of clusters:   6   Maximum cluster size: 8
>
>
> Thanks, Daria
>
>
> ****************************
>
> Daria Mendola
> Department of Statistics and Matematics
> "Silvio Vianelli"
> University di Palermo
> Viale delle Scienze - Edificio 13
> 90128 Palermo, Italy
> phone  +39 091 6626210
> fax  +39 091 485726
> email: mendola at dssm.unipa.it
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>

Thomas Lumley			Assoc. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle




More information about the R-help mailing list