[R] Two repeated warnings when running gam(mgcv) to analyse my dataset?

Tue Dec 18 11:54:50 CET 2007

The model here is just a penalised GLM, and the warnings relate to the GLM 
fitting process. Fitted probabilities of 0 or 1 can be perfectly appropriate, 
but they do indicate that the linear predictor is not really uniquely defined, 
and that some care may be needed in interpreting results (for example, if the 
fitted probabilities are zero or one, then a CI for the corresponding linear 
predictor will depend more on the prior assumptions about smoothness than on 
anything else). This problem is not really GAM specific: it relates to any 
`logistic regression' model. 
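
To see why the linear predictor is not uniquely defined in such cases, here is 
a minimal sketch on a made-up, perfectly separated data set (the data, the 
helper `loglik` and the chosen slopes are illustrative assumptions, not part 
of the original question). The Bernoulli log-likelihood keeps increasing 
towards zero as the slope grows, so no finite slope maximises it.

```r
# Hypothetical toy data: y switches from 0 to 1 between x = 4 and x = 5.
x <- 1:8
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# Bernoulli log-likelihood for the linear predictor eta = slope * (x - 4.5).
loglik <- function(slope) {
  eta <- slope * (x - 4.5)
  sum(y * eta - log1p(exp(eta)))
}

slopes <- c(1, 2, 5, 10, 20)
ll <- sapply(slopes, loglik)

# Each larger slope gives a larger (less negative) log-likelihood.
data.frame(slope = slopes, loglik = ll)
```

Any sufficiently steep slope fits these data essentially equally well, which 
is why the data alone say little about the linear predictor here.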

Similarly, the IRLS iterations used for GLM fitting are not guaranteed to 
converge, and can fail, especially for overly flexible logistic regression 
models. Try this, for example:

x <- 1:10
y <- c(0,0,0,0,0,1,1,1,1,1)
glm(y~x,family=binomial)

I get...
...
Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
  algorithm did not converge
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
  fitted probabilities numerically 0 or 1 occurred

...as models become more complex the scope for this sort of thing to happen 
increases, and some simplification may be appropriate. 
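
When fits like this are run many times, it can help to record the warnings 
per fit rather than let them pile up. The following is only a sketch using 
base R condition handling on the same toy data; the variable names and the 
tolerance `tol` are assumptions chosen for illustration.

```r
x <- 1:10
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)

# Collect warning messages raised during fitting instead of printing them.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

msgs                # warning text(s) raised by glm.fit for this fit
fit$converged       # whether IRLS met its convergence criterion
range(fitted(fit))  # how close the fitted probabilities get to 0 and 1

# Count fitted probabilities within an (arbitrary) tolerance of 0 or 1.
tol <- 1e-8
n_extreme <- sum(fitted(fit) < tol | fitted(fit) > 1 - tol)
n_extreme
```

Inspecting `msgs`, `fit$converged` and the fitted probabilities for each fit 
makes it easier to decide which results need the extra care described above.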

That said, mgcv::gam fitting with all smoothing parameters fixed is slightly 
more likely to fail in this way than `glm', or `mgcv::gam' with some smoothing 
parameters estimated. When all smoothing parameters are fixed, mgcv uses older 
fitting routines that don't try as hard to stabilise a divergent fit as the 
newer fitting routines. This is a bit of an anomaly and I'll try to fix it for 
a future release. 
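
For a permutation loop with fixed smoothing parameters, one possible way to 
keep track of problem fits is sketched below. Everything here is illustrative: 
the simulated data, the two-smooth model, the `sp` values and the number of 
permutations are assumptions, and the sketch relies on the fitted object 
carrying a `converged` flag as described in ?gamObject.

```r
library(mgcv)

set.seed(1)
n <- 200
dat <- data.frame(x = runif(n), z = runif(n))
dat$mark <- rbinom(n, 1, plogis(2 * sin(2 * pi * dat$x)))

n_perm <- 20                     # small number, for illustration only
converged <- logical(n_perm)     # convergence flag for each permutation
n_warn <- integer(n_perm)        # number of warnings for each permutation

for (j in seq_len(n_perm)) {
  perm <- dat
  perm$x <- dat$x[sample(n)]     # permute one covariate
  w <- 0L
  fit <- withCallingHandlers(
    gam(mark ~ s(x) + s(z), family = binomial, data = perm,
        sp = c(0.01, 0.01)),     # smoothing parameters held fixed
    warning = function(cond) {
      w <<- w + 1L               # count the warning, then silence it
      invokeRestart("muffleWarning")
    }
  )
  converged[j] <- isTRUE(fit$converged)
  n_warn[j] <- w
}

table(converged)
sum(n_warn > 0)                  # permutations that raised any warning
```

Permutations flagged in this way could then be examined or refitted 
separately instead of being pooled with the well-behaved fits.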

best,
Simon

On Monday 17 December 2007 11:53, zhijie zhang wrote:
> Dear Simon,
> Sorry for the incomplete description of the question.
> # mgcv version is 1.3-29, R 2.6.1, Windows XP
> # m.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,family=binomial(logit),data=point)
> The call above is the core of the loop below.
> It works well if I run it only once on my dataset, but warnings occur
> when I run it many times in the following loop.
>
> > while (j<1001) {
> +  index=sample(ID, replace=F)
> +  m.data$x=coords[index,]$x
> +  m.data$y=coords[index,]$y
> +  # For each permutation, we run the GAM using the optimal span for the above model m.gam
> +  s.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,
> +    sp=c(5.582647e-07,4.016504e-02,2.300424e-04,1.274065e+03,9.558236e-09,1.868827e-08),
> +    family=binomial(logit),data=m.data)
> +  permresults[,i]=predict.gam(s.gam)
> +  i=i+1
> +  if (j%%100==0) print(i)
> +  j=j+1
> +  }
> [1] 101
> [1] 201
> [1] 301
> [1] 401
> [1] 501
> [1] 601
> [1] 701
> [1] 801
> [1] 901
> [1] 1001
> warnings() over 50
>
> > warnings()
>
> 1: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ... :
>   fitted probabilities numerically 0 or 1 occurred
> ......................................
> 14: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ... :
>   Algorithm did not converge
> ..........................
>
> On Dec 17, 2007 4:54 PM, Simon Wood <s.wood at bath.ac.uk> wrote:
> > What mgcv version are you running (and on what platform)?
> >
> > On Thursday 13 December 2007 17:46, zhijie zhang wrote:
> > > Dear all,
> > >  I run GAMs (generalized additive models) with gam() in mgcv using the
> > > following code.
> > >
> > > m.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,family=binomial(logit),data=point)
> > >
> > >  And two repeated warnings appeared.
> > > Warnings:
> > > 1: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...
> > > : Algorithm did not converge
> > >
> > > 2: In gam.fit(G, family = G$family, control = control, gamma = gamma,  ...
> > > : fitted probabilities numerically 0 or 1 occurred
> > >
> > > Q1: For warning 1, could it be solved by changing the value of the
> > > mgcv.tol option, e.g. gam.control(mgcv.tol=1e-7)?
> > >
> > > Q2: For warning 2, is there any impact on the results if "fitted
> > > probabilities numerically 0 or 1 occurred"?  How can I solve it?
> > >
> > >  I didn't try the possible solutions for them, because it takes such a
> > > long time to run the whole program.
> > >  Could anybody suggest their solutions?
> > >  Any help or suggestions are greatly appreciated.
> > >   Thanks.
> >
> > --
> >
> > > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> > > +44 1225 386603  www.maths.bath.ac.uk/~sw283
