[R] Two repeated warnings when running gam(mgcv) to analyse my dataset?
Simon Wood
s.wood at bath.ac.uk
Tue Dec 18 11:54:50 CET 2007
The model here is just a penalised GLM, and the warnings relate to the GLM
fitting process. Fitted probabilities of 0 or 1 can be perfectly appropriate,
but do indicate that the linear predictor is not really uniquely defined, and
that some care may be needed in interpreting results (for example, if the
fitted probabilities are zero or one, then a CI for the corresponding linear
predictor will depend more on the prior assumptions about smoothness than
anything else). This problem is not really GAM specific; it relates to any
`logistic regression' model.
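As a rough illustration of the point about interpretation, the sketch below compares an ordinary unpenalised `glm' fit to overlapping and to perfectly separated binary responses. The data and the names `x2', `y_mix' and `y_sep' are made up for this example and are not from the original question. With separation the fitted probabilities are pushed to 0 or 1 and the standard errors become enormous, so the data say very little about the linear predictor.

```r
# Hypothetical data: same covariate, two different binary responses.
x2    <- 1:12
y_mix <- c(0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 1)  # classes overlap in x2
y_sep <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)  # classes perfectly separated

fit_mix <- glm(y_mix ~ x2, family = binomial)
# The separated fit is expected to warn; the warnings are suppressed here
# only to keep the output readable.
fit_sep <- suppressWarnings(glm(y_sep ~ x2, family = binomial))

# Standard error of the slope in each fit
se_mix <- summary(fit_mix)$coefficients["x2", "Std. Error"]
se_sep <- summary(fit_sep)$coefficients["x2", "Std. Error"]
print(c(overlapping = se_mix, separated = se_sep))

# Fitted probabilities, and standard errors on the linear predictor scale
print(range(fitted(fit_sep)))
pr_sep <- predict(fit_sep, type = "link", se.fit = TRUE)
print(summary(pr_sep$se.fit))
```

In a penalised fit the smoothing penalty supplies the information that the data lack, which is why an interval for the linear predictor then depends mostly on the smoothness assumptions.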
Similarly, the GLM fitting IRLS iterations are not guaranteed to converge, and
can fail, especially for overly flexible logistic regression models. Try
this, for example....
x <- 1:10
y <- c(0,0,0,0,0,1,1,1,1,1)
glm(y~x,family=binomial)
I get...
...
Warning messages:
1: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart, :
algorithm did not converge
2: In glm.fit(x = X, y = Y, weights = weights, start = start, etastart =
etastart, :
fitted probabilities numerically 0 or 1 occurred
...as models become more complex the scope for this sort of thing to happen
increases, and some simplification may be appropriate.
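When many models are fitted in a loop, it can help to record which fits produced warnings rather than inspecting `warnings()' at the end. The following is a minimal sketch using base R condition handling; the helper name `fit_with_warnings' is hypothetical and not part of R or mgcv.

```r
# Hypothetical helper: evaluate a fitting function and collect any
# warning messages it raises alongside the fitted object.
fit_with_warnings <- function(fit_fun) {
  msgs <- character(0)
  value <- withCallingHandlers(
    fit_fun(),
    warning = function(w) {
      msgs <<- c(msgs, conditionMessage(w))  # remember the message
      invokeRestart("muffleWarning")         # and carry on fitting
    }
  )
  list(value = value, warnings = msgs)
}

# Same toy data as in the example above
x <- 1:10
y <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
res <- fit_with_warnings(function() glm(y ~ x, family = binomial))

print(res$warnings)         # messages raised during this fit
print(res$value$converged)  # glm's own convergence flag
```

Inside a permutation loop, the `warnings' component could be stored per iteration so that problematic replicates can be examined, simplified or excluded afterwards.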
That said, mgcv::gam fitting with all smoothing parameters fixed is slightly
more likely to fail in this way than `glm', or than `mgcv::gam' with some
smoothing parameters estimated. When all smoothing parameters are fixed, mgcv
uses older fitting routines that don't try as hard to stabilise a divergent
fit as the newer fitting routines do. This is a bit of an anomaly and I'll try
to fix it for a future release.
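For reference, the two ways of calling `gam' being contrasted look roughly like this. This is only a sketch on simulated data (the data frame `dat' and its columns are invented for the example); it does not try to reproduce the anomaly, and which internal fitting routine is used in each case depends on the mgcv version.

```r
library(mgcv)

# Simulated binary data, purely for illustration
set.seed(1)
n   <- 200
dat <- data.frame(x = runif(n))
dat$y <- rbinom(n, 1, plogis(4 * sin(2 * pi * dat$x)))

# Smoothing parameter estimated as part of the fit
m_est <- gam(y ~ s(x), family = binomial, data = dat)

# Smoothing parameter held fixed at the value estimated above,
# via the `sp' argument
m_fix <- gam(y ~ s(x), family = binomial, data = dat, sp = m_est$sp)

# Convergence flags stored on the fitted objects
print(c(estimated = isTRUE(m_est$converged), fixed = isTRUE(m_fix$converged)))
```

Checking the `converged' component after each fit is one way to flag replicates whose fixed-`sp' fit did not settle.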
best,
Simon
On Monday 17 December 2007 11:53, zhijie zhang wrote:
> Dear Simon,
> Sorry for an incomplete listing of the question.
> #mgcv version is 1.3-29, R 2.6.1, windows XP
> #m.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,family=binomial(logit),data=point)
> The above is the core code of the loop program below.
> It works well if I run the above code only once on my dataset, but
> warnings occur if I run it many times in the following loop.
>
> > while (j<1001) {
>
> + index=sample(ID, replace=F)
> + m.data$x=coords[index,]$x
> + m.data$y=coords[index,]$y
> + # For each permutation, we run the GAM using the optimal span for the above model m.gam
> + s.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,sp=c(5.582647e-07,4.016504e-02,2.300424e-04,1.274065e+03,9.558236e-09,1.868827e-08),family=binomial(logit),data=m.data)
> + permresults[,i]=predict.gam(s.gam)
> + i=i+1
> + if (j%%100==0) print(i)
> + j=j+1
> + }
> [1] 101
> [1] 201
> [1] 301
> [1] 401
> [1] 501
> [1] 601
> [1] 701
> [1] 801
> [1] 901
> [1] 1001
> warnings() over 50
>
> > warnings()
>
> 1: In gam.fit(G, family = G$family, control = control, gamma = gamma, ... :
>   fitted probabilities numerically 0 or 1 occurred
> ......................................
> 14: In gam.fit(G, family = G$family, control = control, gamma = gamma, ... :
>   Algorithm did not converge
> ..........................
>
> On Dec 17, 2007 4:54 PM, Simon Wood <s.wood at bath.ac.uk> wrote:
> > What mgcv version are you running (and on what platform)?
> >
> > On Thursday 13 December 2007 17:46, zhijie zhang wrote:
> > > Dear all,
> > > I run the GAMs (generalized additive models) in gam(mgcv) using the
> > > following codes.
> > >
> > > m.gam<-gam(mark~s(x)+s(y)+s(lstday2004)+s(ndvi2004)+s(slope)+s(elevation)+disbinary,family=binomial(logit),data=point)
> > >
> > > And two repeated warnings appeared.
> > > Warnings:
> > > 1: In gam.fit(G, family = G$family, control = control, gamma = gamma, ... :
> > >   Algorithm did not converge
> > > 2: In gam.fit(G, family = G$family, control = control, gamma = gamma, ... :
> > >   fitted probabilities numerically 0 or 1 occurred
> > >
> > > Q1: For warning 1, could it be solved by changing the value of the
> > > mgcv.tol option, e.g. gam.control(mgcv.tol=1e-7)?
> > >
> > > Q2: For warning 2, is there any impact on the results if "fitted
> > > probabilities numerically 0 or 1 occurred"? How can I solve it?
> > >
> > > I haven't tried possible solutions for them, because it takes such a
> > > long time to run the whole program.
> > > Could anybody suggest solutions?
> > > Any help or suggestions are greatly appreciated.
> > > Thanks.
> >
> > --
> >
> > > Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> > > +44 1225 386603 www.maths.bath.ac.uk/~sw283