[R-sig-eco] compar.gee- Generalized estimating equations and inflated type 1 errors

Tom Oliver toliver at ceh.ac.uk
Tue Feb 17 17:27:28 CET 2009


Hello Helplist,

I was wondering if anyone could shed some light on this problem..

I have been using the compar.gee package (Package ape version 2.2) for a comparative analaysis using a binary categorical variable.
My phylogenetic tree has 42 tips with branch lengths set to 1 and the response variable I am using is slightly negatively skewed (i.e. non-normal). However, from background reading apparently GEEs are fairly robust to such skew.

The output suggests my explanatory variable is highly correlated with my response across species (output #1 below). When I run a normal anova, however, my result is nowhere near the p<0.05 significance level (output #2 below). I was quite suspicious about the results so I decided to randomly shuffle the response and explanatory vectors and repeat the tests 30 times. For the normal ANOVA 2 out of 30 results were significant (p<0.05 without bonferroni). But for the GEE, 17 out of 30 results were still significant.

Clearly something is not quite right! I have also tried using a continuous explanatory variable and the GEE model still seems to overestimate significance. I can find very little material detailing the use or limits of GEEs in R. The R package help page contains no information on GEE assumptions etc., nor does the book by E.Paradis- Analysis of Phylogenetics and Evolution with R. 

I have noticed that if I specificy the GEE regressions to go through the origin (output #3)  then the results are far less significant. Indeed, zero out of the 30 tests are significant. However, in the  example in the compar.gee help page it suggests that GEE regressions NOT through the origin give more comparable results to phylogentically independent contrasts (?pic help page example).

Any help you can offer is much appreciated.
Tom Oliver


######## 

output #1 

print(compar.gee(slope~explanatory,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
 (Intercept) explanatoryy 
 -0.22694925   0.02282178 

Call:
  formula: slope ~ explanatory

Number of observations:  42 

Model:
 Link:                      identity 
 Variance to Mean Relation: gaussian 

Summary of Residuals:
       Min         1Q     Median         3Q        Max 
-6.1949077 -0.5137222  0.2786307  0.5084768  3.7319044 


Coefficients:
               Estimate      S.E.          t  Pr(T > |t|)
(Intercept)   0.3961744 0.7632086  0.5190905 0.6241267392
explanatoryy -0.8182404 0.1159283 -7.0581610 0.0006200716

Estimated Scale Parameter:  2.100696
"Phylogenetic" df (dfP):  7.439024 

#######'################

output #2 

print(summary(lm(slope~explanatory)))

Call:
lm(formula = slope ~ explanatory)

Residuals:
    Min      1Q  Median      3Q     Max 
-6.4128 -0.0539  0.1215  0.2905  3.5140 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.22695    0.38631  -0.587    0.560
explanatoryy  0.02282    0.46490   0.049    0.961

Residual standard error: 1.393 on 40 degrees of freedom
Multiple R-squared: 6.024e-05,  Adjusted R-squared: -0.02494 
F-statistic: 0.00241 on 1 and 40 DF,  p-value: 0.961 

##############

output #3 

 print(compar.gee(slope~explanatory-1,data2,phy=tr2))
Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27
running glm to get initial regression estimate
explanatoryn explanatoryy 
  -0.2269493   -0.2041275 

Call:
  formula: slope ~ explanatory - 1

Number of observations:  42 

Model:
 Link:                      identity 
 Variance to Mean Relation: gaussian 

Summary of Residuals:
       Min         1Q     Median         3Q        Max 
-6.1949077 -0.5137222  0.2786307  0.5084768  3.7319044 


Coefficients:
               Estimate      S.E.          t Pr(T > |t|)
explanatoryn  0.3961744 0.7632086  0.5190905   0.6241267
explanatoryy -0.4220661 0.7595631 -0.5556695   0.6005236

Estimated Scale Parameter:  2.100696
"Phylogenetic" df (dfP):  7.439024 
> 
########
end




Tom Oliver
Biological Records Centre
Centre for Ecology and Hydrology (CEH)
Maclean Building, 
Benson Lane,
Crowmarsh Gifford,
Wallingford, Oxfordshire, OX10 8BB
Tel: 01491 692517

-- 
This message (and any attachments) is for the recipient ...{{dropped:6}}



More information about the R-sig-ecology mailing list