[R-sig-ME] Very odd parameter estimates using GEE with AR-1 correlation structure

Chris Howden chris at trickysolutions.com.au
Mon May 21 01:47:01 CEST 2012


I can't comment much on GEEs, but I believe you can use mixed models for
population inference, if correctly specified and interpreted.

I think when predicting, though, you need to use only the population level
parameters and not the resp level ones.
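
A minimal sketch of that idea, assuming the lme4 package (the thread does not say which mixed-model package was used) and simulated stand-in data rather than Anne's. All object names here are hypothetical. In lme4, `re.form = NA` tells `predict()` to use the fixed effects only:

```r
library(lme4)  # assumption: lme4 is installed; not named anywhere in the thread

set.seed(1)
# Simulated stand-in data (hypothetical): 20 sites, 10 years, 50 trials per site-year
d <- expand.grid(Site = factor(1:20), yr = 0:9)
site_eff <- rnorm(20, sd = 1)                        # site-level random intercepts
eta <- -1 + 0.1 * d$yr + site_eff[as.integer(d$Site)]
d$pres <- rbinom(nrow(d), size = 50, prob = plogis(eta))
d$abs  <- 50 - d$pres

fit <- glmer(cbind(pres, abs) ~ yr + (1 | Site), family = binomial, data = d)

# Population-level prediction: random effects dropped (re.form = NA)
nd  <- data.frame(yr = 0:9)
pop <- predict(fit, newdata = nd, re.form = NA, type = "response")

# For comparison: prediction for one specific site, random intercept included
nd_site <- data.frame(yr = 0:9, Site = factor("1", levels = levels(d$Site)))
site1   <- predict(fit, newdata = nd_site, re.form = NULL, type = "response")
```

One caveat: with a logit link, the fixed-effects-only curve describes a site whose random effect is zero, which is not in general the same thing as the average over sites that a GEE targets.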

Others will know more about this than I and can likely comment or
suggest relevant papers.

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and
Innovation, Data Analysis, Modelling and Training

(mobile) 0410 689 945
(fax / office)
chris at trickysolutions.com.au

On 19/05/2012, at 4:22, Anne Bjorkman <annebj at gmail.com> wrote:

> Hello mixed modelers,
>
> I am having problems with some GEE models I am trying to run using geepack.
>
> I have species abundance data for 52 different species in 154 sites over 47
> years, and I am trying to extract slope parameter estimates so that I can
> look at whether these species have increased or decreased in abundance over
> time, while taking into account the repeated measurements at each site over
> time.  I originally started doing this with mixed models, but have been
> advised that GEE would be more appropriate for my data as it gives
> population-averaged responses.
>
> However, when I try to run GEEs on my data I get really bizarre parameter
> estimates for some of my species.  As my dataset is huge I unfortunately
> cannot provide the whole thing, but I have uploaded a subset of the data
> for one species with a particularly bizarre slope parameter estimate here:
> http://dl.dropbox.com/u/4481861/Example_for_GEE_one_species.csv
>
> The data look like this:
>
>   Site Year Species Value_Pres Value_Abs
> 1    1 1961       1          0      2089
> 2    1 1962       1          0      2120
> 3    1 1963       1          0      2089
> 4    1 1964       1          0      2225
> 5    1 1965       1          0      2197
> 6    1 1966       1          0      2208
> 6    1 1966       1          0            2208
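
For anyone following along without the CSV, here is a small sketch of how the two count columns relate to the binomial response used further down. It only retypes the six rows shown above; `head_rows`, `prop` and `zero_share` are hypothetical names, and six rows say nothing about the rest of the series.

```r
# The six rows shown above, typed in by hand (the full CSV is not reproduced here)
head_rows <- data.frame(
  Site       = rep(1, 6),
  Year       = 1961:1966,
  Species    = rep(1, 6),
  Value_Pres = rep(0, 6),
  Value_Abs  = c(2089, 2120, 2089, 2225, 2197, 2208)
)

# cbind(Value_Pres, Value_Abs) in a binomial model means (successes, failures),
# so the modelled quantity is this proportion
head_rows$prop <- with(head_rows, Value_Pres / (Value_Pres + Value_Abs))

# Share of rows with no presences at all
zero_share <- mean(head_rows$Value_Pres == 0)

print(head_rows)
print(zero_share)
```
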
>
> I have been using the following model specification (I have been running a
> loop to calculate estimates for all 52 species separately, but this is for
> just one species):
>
>
> library(doBy)     # orderBy()
> library(geepack)  # geeglm()
>
> # using the doBy package to order by subject (Site), then time (Year)
> speciesA <- orderBy(~Site+Year, data=speciesA)
>
> speciesA.mod <- geeglm(cbind(Value_Pres,Value_Abs) ~ I(Year-1961), data=speciesA,
>                        family=binomial, id=Site, corstr="ar1")
>
>
> Call:
> geeglm(formula = cbind(Value_Pres, Value_Abs) ~ I(Year - 1961),
>     family = binomial, data = speciesA, id = Site, corstr = "ar1")
>
> Coefficients:
>                 Estimate   Std.err    Wald Pr(>|W|)
> (Intercept)    -2.99e+14  9.10e+11  107705   <2e-16 ***
> I(Year - 1961) -9.62e+13  3.88e+10 6155147   <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Estimated Scale Parameters:
>             Estimate  Std.err
> (Intercept) 6.57e+10 5.62e+30
>
> Correlation: Structure = ar1  Link = identity
>
> Estimated Correlation Parameters:
>       Estimate Std.err
> alpha     0.98 4.4e+18
> Number of clusters:   154   Maximum cluster size: 47
>
>
> I suspect the problem might have something to do with the correlation
> structure, as species abundances in subsequent years are often very highly
> correlated, even if there is substantial change over the 47 years overall.
> If I use corstr="independence" I get parameter estimates that
> are very similar to those I got using mixed effects models (at least, the
> slopes for species responses relative to each other are similar).
> Furthermore, if I use corstr="ar1" but subset my data to every 5 years
> instead of every year, I get much more reasonable slope estimates for this
> particular species as well as most of the other species (slope values are
> very similar to the corstr="independence" value), but a few different
> species' slopes then get very weird. (By get weird I mean that they have
> abnormally large positive or negative slopes that don't reflect what's
> happening in the raw data at all).
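
The comparison described above (independence vs. AR-1, then AR-1 on every fifth year) could be laid out side by side roughly as follows. This is only a sketch on simulated stand-in data, not the real dataset: `sim`, `sim5`, the `fit_*` objects and the cut-off of 1 on the logit scale are all hypothetical choices. The `geeglm()` call itself mirrors the one in the original post.

```r
library(geepack)  # geeglm(), as in the original post

set.seed(42)
# Simulated stand-in (hypothetical): 30 sites, 20 years, 40 trials per site-year.
# expand.grid varies Year fastest, so rows are already grouped by Site in year order.
sim <- expand.grid(Year = 1961:1980, Site = 1:30)
site_eff <- rnorm(30, sd = 0.5)
eta <- -2 + 0.05 * (sim$Year - 1961) + site_eff[sim$Site]
sim$Value_Pres <- rbinom(nrow(sim), size = 40, prob = plogis(eta))
sim$Value_Abs  <- 40 - sim$Value_Pres

# Same model under two working correlation structures
fit_ind <- geeglm(cbind(Value_Pres, Value_Abs) ~ I(Year - 1961), data = sim,
                  family = binomial, id = Site, corstr = "independence")
fit_ar1 <- geeglm(cbind(Value_Pres, Value_Abs) ~ I(Year - 1961), data = sim,
                  family = binomial, id = Site, corstr = "ar1")

# AR-1 again, keeping only every fifth year
sim5 <- sim[(sim$Year - 1961) %% 5 == 0, ]
fit_ar1_5 <- geeglm(cbind(Value_Pres, Value_Abs) ~ I(Year - 1961), data = sim5,
                    family = binomial, id = Site, corstr = "ar1")

# Slope estimates side by side (logit scale, per year)
slopes <- c(independence = unname(coef(fit_ind)[2]),
            ar1          = unname(coef(fit_ar1)[2]),
            ar1_5yr      = unname(coef(fit_ar1_5)[2]))
print(slopes)

# Crude flag for "weird" fits; the cut-off of 1 is an arbitrary illustration
print(abs(slopes) > 1)
```

Run inside the per-species loop, a table like `slopes` would show at a glance which species only misbehave under `corstr="ar1"`.
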
>
>
> I would really appreciate some insight into what the problem with my data
> could be, or, more particularly, how to fix it! My head and my wall would
> be very grateful! Perhaps I should just give up on GEEs and go back to
> mixed models??
>
>
> Thanks very much,
>
> Anne
>