[R-sig-eco] low predicted vales in GAMs (Anna Renwick)

Highland Statistics Ltd. highstat at highstat.com
Sat Dec 12 12:27:46 CET 2009


> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 11 Dec 2009 11:43:40 -0000
> From: "Anna Renwick" <anna.renwick at bto.org>
> Subject: [R-sig-eco] low predicted vales in GAMs
> To: <r-sig-ecology at r-project.org>
> Message-ID: <BFD6DF2C5CA142C58C272652FA017856 at btodomain.bto.org>
> Content-Type: text/plain
>
> Dear All
>
>  
>
> I have come across a problem with the GAM models I am running. Basically the
> predicted values are consistently only about 0.4 of the actual values. 
>
>  
>
> A bit more detail:
>
> MODEL:
>
> m4 <- gam(count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
>             cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
>             offset(log(fit.vec)),
>           weights = wt,
>           data = spat6,
>           family = quasipoisson,
>           start = rep(0, 26))
>
> MODEL SUMMARY:
>
>  
>
> Family: quasipoisson
> Link function: log
>
> Formula:
> count ~ s(east, north, k = 10) + ez + cv01 + cv03 + cv04 + cv05 +
>     cv07 + mtemp + mtotalrain + ez:mtemp + ez:mtotalrain +
>     offset(log(fit.vec))
>
> Parametric coefficients:
>                  Estimate Std. Error   t value Pr(>|t|)
> (Intercept)    -5.296e+00  1.846e+00    -2.869 0.004166 **
> ezM             1.651e+00  2.102e+00     0.785 0.432397
> ezP             7.358e+00  2.047e+00     3.595 0.000332 ***
> ezU            -1.061e+02  1.064e+07 -9.97e-06 0.999992
> cv01            7.405e-02  5.437e-03    13.620  < 2e-16 ***
> cv03            2.258e-02  5.145e-03     4.389 1.20e-05 ***
> cv04            2.878e-02  4.839e-03     5.949 3.18e-09 ***
> cv05            3.634e-02  5.326e-03     6.823 1.17e-11 ***
> cv07            2.370e-02  5.712e-03     4.149 3.48e-05 ***
> mtemp          -1.838e-01  1.750e-01    -1.050 0.293900
> mtotalrain      1.872e-02  5.072e-03     3.692 0.000229 ***
> ezM:mtemp       6.181e-02  2.204e-01     0.280 0.779197
> ezP:mtemp      -7.028e-01  2.050e-01    -3.429 0.000619 ***
> ezU:mtemp       8.697e-01  1.371e+06  6.34e-07 0.999999
> ezM:mtotalrain -3.393e-02  5.799e-03    -5.851 5.68e-09 ***
> ezP:mtotalrain -1.901e-02  5.379e-03    -3.535 0.000417 ***
> ezU:mtotalrain  3.510e-02  4.074e+04  8.62e-07 0.999999
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Approximate significance of smooth terms:
>                 edf Ref.df     F p-value
> s(east,north) 8.736  8.736 28.88  <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> R-sq.(adj) =  0.324   Deviance explained = -5.12e+03%
> GCV score = 39.556  Scale est. = 39.056    n = 2038
>
>  
>
>  
>
> Count = bird counts/square
>
>   

Is this really an integer, i.e. a raw count rather than a count divided
by the number of squares?
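
A quick way to check is sketched below. The vector is made up purely for
illustration; with the real data one would presumably pass spat6$count.
The helper name is_count is hypothetical, not a function from any
package. Note that family = quasipoisson does not warn about non-integer
responses in the way family = poisson does, so such a check has to be
done by hand.

```r
# Made-up stand-in for the response; replace with the real count column
count <- c(0, 3, 1.5, 7, 2)

# Hypothetical helper: TRUE only if every value is a non-negative whole number
is_count <- function(x, tol = 1e-8) {
  all(!is.na(x) & x >= 0 & abs(x - round(x)) < tol)
}

is_count(count)           # the 1.5 makes this fail the check
is_count(c(0, 3, 1, 7))   # whole numbers only
```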


> ez=environmental zone
>
> cv = habitat types
>
> mtemp = mean annual temperature
>
> mtotalrain= mean total rain/year
>
>  
>
> Sample size is approximately 2000.
>
>  
>
> The offset fit.vec is bird detectability and the weighting is based on the
> number of squares in each area surveyed. I believe that the strange deviance
> explained is due to the weighting we have added to the model.
>
>   
Why would you use a weighting factor in a Poisson or quasi-Poisson
GLM/GAM? See also the description of the weights argument in the help
file for glm. I am not sure what the weights would be doing here.
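
As a rough illustration of what prior weights do in a Poisson GLM, the
sketch below uses simulated data and plain glm rather than gam. All
names (d, w, f_w, f_rep) and the weight values are assumptions made for
the example. With integer weights, the weighted fit gives the same
coefficients as a fit in which each row is simply repeated w times.

```r
# Simulated data, for illustration only
set.seed(1)
d <- data.frame(x = runif(50))
d$y <- rpois(50, exp(0.5 + d$x))
d$w <- rep(c(1, 3), 25)            # hypothetical integer weights

# Fit using prior weights
f_w <- glm(y ~ x, family = poisson, data = d, weights = w)

# Fit to a data set in which row i is repeated w[i] times
d_rep <- d[rep(seq_len(nrow(d)), d$w), ]
f_rep <- glm(y ~ x, family = poisson, data = d_rep)

# The two sets of coefficients agree up to numerical tolerance
all.equal(coef(f_w), coef(f_rep), tolerance = 1e-6)

# Weighted (not unweighted) residuals sum to zero in the weighted fit
sum(d$w * (d$y - fitted(f_w)))
```

So a weight of w treats an observation as if the same count had been
seen w times, and it is the weighted totals of observed and fitted
values that are forced to agree. Whether that explains the low
predictions in the model above cannot be said from this sketch alone.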

>  
>
> I would have assumed that the predicted values divided by the real counts
> should be around 1, however they are much lower and hence the model is
> consistently predicting lower counts than were observed. I was wondering if
> there is anything obvious which I am missing when carrying out these models.
>
>   

You seem to have very large overdispersion (the scale estimate is about
39), but that is a separate problem. I think the number of squares
should actually go into the offset (as its log, obviously).
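
A minimal sketch of that idea, again with simulated data and glm
instead of gam. The variable nsq is a hypothetical name for the number
of squares surveyed; in the model above it would presumably be combined
with the detectability offset, e.g. offset(log(fit.vec) + log(nsq)),
but that is an assumption about the data and not something tested here.

```r
# Simulated data, for illustration only
set.seed(2)
n   <- 200
nsq <- sample(1:10, n, replace = TRUE)     # hypothetical number of squares
x   <- runif(n)
y   <- rpois(n, nsq * exp(-1 + 0.8 * x))   # expected count scales with effort

# Effort enters as an offset on the log scale; no weights argument
f_off <- glm(y ~ x + offset(log(nsq)), family = poisson)

coef(f_off)   # estimates of the rate per square

# With a log link and an intercept, observed and fitted totals agree
c(observed = sum(y), fitted = sum(fitted(f_off)))
```

With the offset, the coefficients describe the count per square, and
the fitted values are on the scale of the observed counts.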

Alain

>  
>
> Many thanks,
>
> Anna
>
>  
>
> Dr Anna R. Renwick
> Research Ecologist
> British Trust for Ornithology, 
> The Nunnery, 
> Thetford, 
> Norfolk, 
> IP24 2PU, 
> UK
> Tel: +44 (0)1842 750050; Fax: +44 (0)1842 750030 
>
>  
>


-- 


Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Mixed effects models and extensions in ecology with R. (2009).
Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
http://www.springer.com/life+sci/ecology/book/978-0-387-87457-9


3. A Beginner's Guide to R (2009).
Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
http://www.springer.com/statistics/computational/book/978-0-387-93836-3


Other books: http://www.highstat.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: highstat at highstat.com
URL: www.highstat.com
URL: www.brodgar.com


