[R-sig-eco] low predicted vales in GAMs (Anna Renwick)
Anna Renwick
anna.renwick at bto.org
Wed Dec 16 13:25:55 CET 2009
Dear All
I wanted to thank everyone for their helpful comments. With your help, and
that of Simon Wood, I now realise that my predicted values are low because
I have so many zeros in my data. Because the model structure I have
specified requires the mean to be positive everywhere, the model
over-predicts the zero counts, and in order not to predict more counts
than there actually are, it underestimates the non-zero counts (this
underestimation can be quite large because of the high number of zeros).
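To illustrate the mechanism described above, here is a small simulation sketch. It uses made-up data (70% structural zeros, a single covariate x) and a plain quasi-Poisson GLM rather than the GAM and data from this thread, so the numbers are only indicative.

```r
set.seed(1)
n <- 2000
x <- runif(n)

# Hypothetical zero-inflated counts: 70% structural zeros, otherwise Poisson
is_zero <- rbinom(n, 1, 0.7) == 1
y <- ifelse(is_zero, 0L, rpois(n, lambda = exp(1 + x)))

fit <- glm(y ~ x, family = quasipoisson)
mu <- fitted(fit)

# Over all observations the fitted and observed means agree
# (log link with an intercept, no prior weights)
mean_all <- c(observed = mean(y), fitted = mean(mu))
print(mean_all)

# Restricted to the non-zero observations, the fitted mean falls well short
nz <- y > 0
ratio_nonzero <- mean(mu[nz]) / mean(y[nz])
print(ratio_nonzero)
```

The overall totals balance, but the fitted values for the squares where birds were actually counted are pulled down by the many zeros.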
So one thing I am thinking of is to try a zero-inflated model. I have looked
at the COZIGAM package, but you do not seem to be able to use an offset with it.
I was wondering if anybody knows of a package where weighted zero-inflated
GAM models with an offset can be run.
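For reference, this is roughly what a weighted zero-inflated Poisson likelihood with an offset looks like when written out by hand. It is only a sketch under stated assumptions: the data are simulated, the names effort, wt and negloglik are hypothetical, the linear predictor is purely parametric (no smooth terms, so it is a GLM rather than a GAM), and the zero-inflation probability is held constant.

```r
# Simulated data (hypothetical; not the data set discussed in this thread)
set.seed(42)
n <- 1500
x <- runif(n)
effort <- runif(n, 0.5, 2)       # plays the role of the offset variable
wt <- runif(n, 0.5, 1.5)         # hypothetical prior weights
structural_zero <- rbinom(n, 1, 0.6) == 1
y <- ifelse(structural_zero, 0L, rpois(n, effort * exp(0.5 + 1.0 * x)))

X <- cbind(1, x)

# Weighted negative log-likelihood of a zero-inflated Poisson model:
#   P(y = 0) = p + (1 - p) * exp(-mu)
#   P(y = k) = (1 - p) * dpois(k, mu)          for k > 0
#   log(mu)  = X %*% beta + log(effort),  logit(p) = constant
negloglik <- function(par) {
  beta <- par[1:2]
  p <- plogis(par[3])
  mu <- exp(drop(X %*% beta) + log(effort))
  ll <- ifelse(y == 0,
               log(p + (1 - p) * exp(-mu)),
               log1p(-p) + dpois(y, mu, log = TRUE))
  -sum(wt * ll)
}

fit <- optim(c(0, 0, 0), negloglik, method = "BFGS")
est <- c(intercept = fit$par[1], slope = fit$par[2],
         p_zero = plogis(fit$par[3]))
print(round(est, 2))
```

The offset enters the count part of the model only, and the weights multiply each observation's log-likelihood contribution. Smooth terms could in principle be added by replacing X with a spline basis, but that goes beyond this sketch.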
Many thanks,
Anna
Dr Anna R. Renwick
Research Ecologist
British Trust for Ornithology,
The Nunnery,
Thetford,
Norfolk,
IP24 2PU,
UK
Tel: +44 (0)1842 750050; Fax: +44 (0)1842 750030
-----Original Message-----
From: r-sig-ecology-bounces at r-project.org
[mailto:r-sig-ecology-bounces at r-project.org] On Behalf Of Highland
Statistics Ltd.
Sent: 12 December 2009 11:28
To: r-sig-ecology at r-project.org
Subject: Re: [R-sig-eco] low predicted vales in GAMs (Anna Renwick)
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 11 Dec 2009 11:43:40 -0000
> From: "Anna Renwick" <anna.renwick at bto.org>
> Subject: [R-sig-eco] low predicted vales in GAMs
> To: <r-sig-ecology at r-project.org>
> Message-ID: <BFD6DF2C5CA142C58C272652FA017856 at btodomain.bto.org>
> Content-Type: text/plain
>
> Dear All
>
>
>
> I have come across a problem with the GAM models I am running. Basically the
> predicted values are consistently only about 0.4 of the actual values.
>
>
>
> A bit more detail:
>
> MODEL:
>
>
> m4 <- gam(count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
>             cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
>             offset(log(fit.vec)),
>           weights = wt,
>           data = spat6,
>           family = quasipoisson,
>           start = rep(0, 26))
>
> MODEL SUMMARY:
>
> Family: quasipoisson
> Link function: log
>
> Formula:
> count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
>     cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
>     offset(log(fit.vec))
>
> Parametric coefficients:
>                  Estimate Std. Error   t value Pr(>|t|)
> (Intercept)    -5.296e+00  1.846e+00    -2.869 0.004166 **
> ezM             1.651e+00  2.102e+00     0.785 0.432397
> ezP             7.358e+00  2.047e+00     3.595 0.000332 ***
> ezU            -1.061e+02  1.064e+07 -9.97e-06 0.999992
> cv01            7.405e-02  5.437e-03    13.620  < 2e-16 ***
> cv03            2.258e-02  5.145e-03     4.389 1.20e-05 ***
> cv04            2.878e-02  4.839e-03     5.949 3.18e-09 ***
> cv05            3.634e-02  5.326e-03     6.823 1.17e-11 ***
> cv07            2.370e-02  5.712e-03     4.149 3.48e-05 ***
> mtemp          -1.838e-01  1.750e-01    -1.050 0.293900
> mtotalrain      1.872e-02  5.072e-03     3.692 0.000229 ***
> ezM:mtemp       6.181e-02  2.204e-01     0.280 0.779197
> ezP:mtemp      -7.028e-01  2.050e-01    -3.429 0.000619 ***
> ezU:mtemp       8.697e-01  1.371e+06  6.34e-07 0.999999
> ezM:mtotalrain -3.393e-02  5.799e-03    -5.851 5.68e-09 ***
> ezP:mtotalrain -1.901e-02  5.379e-03    -3.535 0.000417 ***
> ezU:mtotalrain  3.510e-02  4.074e+04  8.62e-07 0.999999
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Approximate significance of smooth terms:
>                 edf Ref.df     F p-value
> s(east,north) 8.736  8.736 28.88  <2e-16 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> R-sq.(adj) = 0.324   Deviance explained = -5.12e+03%
> GCV score = 39.556   Scale est. = 39.056   n = 2038
>
>
>
>
>
> Count = bird counts/square
>
>
Is this really an integer?
> ez=environmental zone
>
> cv = habitat types
>
> mtemp = mean annual temperature
>
> mtotalrain= mean total rain/year
>
>
>
> Sample size is approximately 2000.
>
>
>
> The offset fit.vec is bird detectability and the weighting is based on the
> number of squares in each area surveyed. I believe that the strange deviance
> explained is due to the weighting we have added to the model.
>
>
Why would you use a weighting factor in a Poisson/quasi-Poisson GLM/GAM?
See also the description of the weights argument in the help file for glm.
I am not sure what it would be doing here.
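As a rough illustration of what prior weights do in a Poisson GLM, the sketch below uses simulated data and hypothetical integer weights w. With integer weights, the weighted fit gives the same coefficients as a fit to data in which each row is repeated w[i] times, and the fitted values balance the weighted totals rather than the raw ones.

```r
set.seed(7)
n <- 200
x <- runif(n)
y <- rpois(n, exp(0.3 + x))
w <- sample(1:3, n, replace = TRUE)   # hypothetical integer weights

fit_w <- glm(y ~ x, family = poisson, weights = w)

# Same data with each row repeated w[i] times, fitted without weights
idx <- rep(seq_len(n), times = w)
y_rep <- y[idx]
x_rep <- x[idx]
fit_rep <- glm(y_rep ~ x_rep, family = poisson)

# Largest difference between the two sets of coefficients
coef_diff <- max(abs(unname(coef(fit_w)) - unname(coef(fit_rep))))
print(coef_diff)

# The score equations balance the weighted totals, not the raw ones
weighted_gap <- sum(w * (y - fitted(fit_w)))
raw_gap <- sum(y - fitted(fit_w))
print(c(weighted = weighted_gap, raw = raw_gap))
```

So when the weights vary a lot between observations, the unweighted sum of the fitted values need not match the unweighted sum of the counts.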
>
>
> I would have assumed that the predicted values divided by the real counts
> should be around 1, however they are much lower and hence the model is
> consistently predicting lower counts than were observed. I was wondering if
> there is anything obvious which I am missing when carrying out these models.
>
>
You seem to have very large overdispersion, but that is another
problem. I think your number of squares should actually be used in the
offset (log-transformed, obviously).
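A minimal sketch of that suggestion, on simulated data: it assumes that each count is a total over n_squares surveyed squares, and the names detect (standing in for fit.vec) and n_squares are hypothetical. A plain GLM is used to keep the example short; the offset term would be written the same way in a gam() formula.

```r
set.seed(3)
n <- 500
x <- runif(n)
detect <- runif(n, 0.3, 0.9)                  # stands in for fit.vec
n_squares <- sample(1:5, n, replace = TRUE)   # hypothetical survey effort
y <- rpois(n, n_squares * detect * exp(0.2 + 0.8 * x))

# Both effort terms go into the offset on the log scale
fit <- glm(y ~ x + offset(log(detect) + log(n_squares)),
           family = quasipoisson)
print(coef(fit))

# Predictions on the response scale include the offset ...
pred_count <- predict(fit, type = "response")
# ... so dividing by it gives the expected count per square at full detectability
pred_rate <- pred_count / (detect * n_squares)
```

With the effort in the offset, the coefficients describe the count per square, and no weights are needed to account for the number of squares.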
Alain
>
>
> Many thanks,
>
> Anna
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
--
Dr. Alain F. Zuur
First author of:
1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7
2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9
3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3
Other books: http://www.highstat.com/books.htm
Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com