[R] conducting GAM-GEE within gamm4?
Gavin Simpson
gavin.simpson at ucl.ac.uk
Tue May 8 11:34:18 CEST 2012
On Thu, 2012-05-03 at 08:32 -0500, Nathan Furey wrote:
> Dear R-help users,
>
> I am trying to analyze some visual transect data of organisms to generate a
> habitat distribution model. Once organisms are sighted, they are followed
> as point data is collected at a given time interval. Because of the
> autocorrelation among these "follows," I wish to utilize a GAM-GEE approach
> similar to that of Pirotta et al. 2011, using packages 'yags' and 'splines'
> (http://www.int-res.com/abstracts/meps/v436/p257-272/). Their R scripts
> are shown here (
> http://www.int-res.com/articles/suppl/m436p257_supp/m436p257_supp1-code.r).
> I have used this code with limited success and multiple issues of models
> failing to converge.
>
<snip />
> Following this example, the following is the code I used for my dataset
>
> b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),random=(form=~1|block),
> family=binomial(),data=dat)
I'm not familiar with gamm4 but a quick glance at the manual and
examples suggests to me that you are expressing the random effects
argument `random` incorrectly, perhaps because you were reading an
example for the gamm() function in package mgcv?
`random` expects a formula so just pass it as `random = ~ (1|block)`.
The reason for `form = ~ 1 | block` in gamm() is that the corAR1()
function for AR(1) correlation structures has more arguments than just
the `formula` one and the `formula` argument is not the first argument
to that function. But this has nothing to do with the `random` argument
of the gamm4() function.
If you make the above change, gamm4() should take into account the
grouping structure of your data.
HTH
G
> However, by examining the output (summary(b$gam)) and specifically
> summary(b$mer)), I am either unsure of how to interpret the results,
> or do not believe that the autocorrelation within the group is being
> taken into consideration.
>
>
> > summary(b$gam)
>
> Family: binomial
> Link function: logit
>
> Formula:
> dolphin_presence ~ s(dist_slag) + s(Depth)
>
> Parametric coefficients:
>
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -13.968 5.145 -2.715 0.00663 **
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> Approximate significance of smooth terms:
>
> edf Ref.df Chi.sq p-value
> s(dist_slag) 4.943 4.943 70.67 6.85e-14 ***
> s(Depth) 6.869 6.869 115.59 < 2e-16 ***
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> R-sq.(adj) = 0.317glmer.ML score = 10504 Scale est. = 1 n = 10792
> >
>
> > summary(b$mer)
> Generalized linear mixed model fit by the Laplace approximation
>
> AIC BIC logLik deviance
> 10514 10551 -5252 10504
> Random effects:
> Groups Name Variance Std.Dev.
> Xr s(dist_slag) 1611344 1269.39
> Xr.0 s(Depth) 98622 314.04
> Number of obs: 10792, groups: Xr, 8; Xr.0, 8
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> X(Intercept) -13.968 5.145 -2.715 0.00663 **
> Xs(dist_slag)Fx1 -35.871 33.944 -1.057 0.29063
> Xs(Depth)Fx1 3.971 3.740 1.062 0.28823
>
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> Correlation of Fixed Effects:
> X(Int) X(_)F1
> Xs(dst_s)F1 0.654
> Xs(Dpth)Fx1 -0.030 0.000
> >
>
>
> How do I ensure that autocorrelation is indeed being accounted for
> within each unique value of the "block" variable? What is the simplest
> way to interpret the output for "summary(b$mer)"?
>
> The results do differ from a normal gam (package mgcv) using the same
> variables and parameters without the "correlation=..." term,
> indicating that *something *different is occurring.
>
> However, when I use a different variable for the correlation term
> (season), I get the SAME output:
>
> > dat2 <- data.frame(dist_slag = dat$dist_slag, Depth = dat$Depth, dolphin_presence = dat$dolphin_presence,
> + block = dat$block, season=dat$season)
> > head(dat2)
> dist_slag Depth dolphin_presence block season
> 1 26475.47 -10.0934 0 1 F
> 2 26340.47 -10.4870 0 1 F
> 3 25886.33 -16.5752 0 1 F
> 4 25399.88 -22.2474 0 1 F
> 5 24934.29 -29.6797 0 1 F
> 6 24519.90 -26.2370 0 1 F
>
> > summary(dat2$season)
> F S
> 3224 7568
>
>
> > b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth),correlation=corAR1(1, form=~1 | season), family=binomial(),data=dat2)
> > summary(b$gam)
>
> Family: binomial
> Link function: logit
>
> Formula:
> dolphin_presence ~ s(dist_slag) + s(Depth)
>
> Parametric coefficients:
> Estimate Std. Error z value Pr(>|z|)
> (Intercept) -13.968 5.145 -2.715 0.00663 **
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> Approximate significance of smooth terms:
> edf Ref.df Chi.sq p-value
> s(dist_slag) 4.943 4.943 70.67 6.85e-14 ***
> s(Depth) 6.869 6.869 115.59 < 2e-16 ***
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> R-sq.(adj) = 0.317glmer.ML score = 10504 Scale est. = 1 n = 10792
> > summary(b$mer)
> Generalized linear mixed model fit by the Laplace approximation
> AIC BIC logLik deviance
> 10514 10551 -5252 10504
> Random effects:
> Groups Name Variance Std.Dev.
> Xr s(dist_slag) 1611344 1269.39
> Xr.0 s(Depth) 98622 314.04
> Number of obs: 10792, groups: Xr, 8; Xr.0, 8
>
> Fixed effects:
> Estimate Std. Error z value Pr(>|z|)
> X(Intercept) -13.968 5.145 -2.715 0.00663 **
> Xs(dist_slag)Fx1 -35.871 33.944 -1.057 0.29063
> Xs(Depth)Fx1 3.971 3.740 1.062 0.28823
> ---
> Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
>
> Correlation of Fixed Effects:
> X(Int) X(_)F1
> Xs(dst_s)F1 0.654
> Xs(Dpth)Fx1 -0.030 0.000
> >
>
>
> I just want to make sure it is correctly allowing for correlation
> within each value for the "block" variable. How do I formulate the
> model to say that autocorrelation can exist within each single value
> for block, but assume independence among blocks?
>
>
> On another note, I am also receiving the following warning message
> after model completion for larger models (with many more variables
> than 2):
>
> Warning message:
> In mer_finalize(ans) : false convergence (8)
>
>
>
>
> Thank you for your time, effort, and patience dealing with a novice.
>
> Sincerely,
>
> Nathan
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
More information about the R-help
mailing list