[R] Strange error returned or bug in gam in mgcv????

Gavin Simpson gavin.simpson at ucl.ac.uk
Wed Sep 2 12:27:39 CEST 2009


On Wed, 2009-09-02 at 09:26 +0100, Corrado wrote:
> Dear Gavin, Simon,
> 
> this is the result of str:
> 
> > str(dist_scot24_vector_with_climate)
> 'data.frame':   2265025 obs. of  14 variables:
>  $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
>  $ tetrad_i: Factor w/ 1505 levels "HP61A","HP61I",..: 1505 1504 1503 1502 1501 1500 1499 1498 1497 1496 ...
>  $ tetrad_j: Factor w/ 1505 levels "HP61A","HP61I",..: 1505 1505 1505 1505 1505 1505 1505 1505 1505 1505 ...
>  $ bray    : num  0 0.566 0.251 0.407 0.45 ...
>  $ PC1     : num  -3.97 -3.14 -7.27 -5.77 -5.88 ...
>  $ PC2     : num  3.26 2.87 3.19 2.96 2.97 ...
>  $ PC3     : num  -0.16511 -0.28601 -0.00362 -0.11685 -0.09695 ...
>  $ PC4     : num  -0.629 -0.696 -0.6 -0.683 -0.639 ...
>  $ PC5     : num  0.2603 0.3818 -0.0148 0.0967 0.094 ...
>  $ PC6   : num  -3.97 -3.97 -3.97 -3.97 -3.97 ...
>  $ PC7   : num  3.26 3.26 3.26 3.26 3.26 ...
>  $ PC8   : num  -0.165 -0.165 -0.165 -0.165 -0.165 ...
>  $ PC9   : num  -0.629 -0.629 -0.629 -0.629 -0.629 ...
>  $ PC10   : num  0.26 0.26 0.26 0.26 0.26 ...
> >
> 
> It looks ok to me. What do you think?

Doesn't appear to be any problem there.
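
For future reference, the same check can be done without reading the
str() output by eye. A minimal sketch on a made-up data frame (`toy` and
its values are hypothetical, not your data), where one column has been
read in as text and ended up as a factor:

```r
# Hypothetical toy data: PC2 looks numeric but is stored as a factor
toy <- data.frame(bray = c(0, 0.566, 0.251),
                  PC1  = c(-3.97, -3.14, -7.27),
                  PC2  = factor(c("3.26", "2.87", "3.19")))

# Class of every column
col_classes <- sapply(toy, class)
print(col_classes)

# Names of the columns that are not numeric
not_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(not_numeric)

# Convert a factor that holds numbers; go via character, because
# as.numeric() on a factor returns the internal level codes
toy$PC2 <- as.numeric(as.character(toy$PC2))
```

In your case the only non-numeric columns are the two tetrad factors,
which are not in the model formula, so this is not the cause.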

In a separate email I recall you saying you were using 1.4.1 (?). If so,
you should upgrade mgcv to the latest version and try your simple models
again to see if that solves your problem.

I tried to fit a model of the same size as your problem with mgcv 1.5.1,
but ran out of memory on my home desktop (4GB of RAM). I didn't get an
error, though: after about an hour of processing R started swapping to
disk and I had to kill it.
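
A rough back-of-envelope for why 4GB is not enough. This is only a
sketch: it assumes the default basis dimension of k = 10 for each
univariate s() term (9 coefficients each once the identifiability
constraint is applied) plus an intercept, and it counts one dense
double-precision copy of the model matrix and nothing else:

```r
n         <- 2265025   # rows in the data frame
n_smooths <- 10        # s(PC1) + ... + s(PC10)
k         <- 10        # assumed default basis dimension per 1-d smooth

# Columns in the model matrix: intercept + (k - 1) per smooth
p <- 1 + n_smooths * (k - 1)

# One copy of an n x p matrix of doubles (8 bytes each)
bytes <- n * p * 8
gib   <- bytes / 2^30
print(c(columns = p, GiB = gib))
```

That comes to about 1.5 GiB for a single copy. Fitting presumably needs
more than one object of that order at a time, so running into swap on a
4GB machine is not surprising.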

I tried with this dummy data set:

> require(mgcv)
Loading required package: mgcv
This is mgcv  1.5-5 . For overview type `help("mgcv-package")'.
> dat <- data.frame(matrix(rnorm(2265025 * 11), ncol = 11))
> names(dat)
 [1] "X1"  "X2"  "X3"  "X4"  "X5"  "X6"  "X7"  "X8"  "X9"  "X10" "X11"
> mod <- gam(X1 ~ s(X2) + s(X3) + s(X4) + s(X5) + s(X6) + s(X7) + s(X8)
+ s(X9) + s(X10) + s(X11), data = dat)

Are you getting the error fairly quickly once you try to fit the model?
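
On Simon's point (quoted below) about the single 10-dimensional smooth:
his two figures can be reproduced from what I understand to be mgcv's
defaults for thin plate splines, namely a basis dimension of
10 * 3^(d - 1) and a penalty order m that is the smallest integer with
2m > d + 1. Treat those two formulas as my assumptions rather than
documented fact; they do give his numbers:

```r
d <- 10                          # covariates in the one s() term
n <- 2265025                     # rows in the data frame

# Assumed default basis dimension for a d-dimensional "tp" smooth
k_default <- 10 * 3^(d - 1)

# Assumed default penalty order: smallest m with 2m > d + 1
m <- floor((d + 1) / 2) + 1

# Dimension of the penalty null space (the "simplest" functions)
M <- choose(m + d - 1, d)

print(c(k = k_default, m = m, null_space = M))

# Length of the array the basis set-up asks for
len <- n * k_default
print(len > .Machine$integer.max)

# Coercing that length to integer gives NA
len_int <- suppressWarnings(as.integer(len))
print(len_int)
```

If that is right, n * k is far beyond the largest integer R can hold, so
the length passed to array(0, n * k) becomes NA. That would account for
both the coercion warning and the "missing value where TRUE/FALSE
needed" error, and it would happen during basis set-up, before any
fitting starts.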

HTH

G

> 
> On Tuesday 01 September 2009 18:43:24 Gavin Simpson wrote:
> > On Tue, 2009-09-01 at 17:55 +0100, Corrado wrote:
> > > Dear Simon,
> > >
> > > I have stored all information at the link:
> > >
> > > http://scsys.co.uk:8002/33309?hl=on&submit=Format+it!
> >
> > You could have included that in your mail to the list - it is just plain
> > text after all.
> >
> > > I have the same problem if I do
> > > s(PC1)  + ..... + s(PC10) or
> > > s(PC1,PC2,PC3,PC4,PC5)+s(PC6,PC7,PC8,PC9,PC10) or
> > > s(PC1,PC2,PC3,PC6,PC7,PC8) .....
> > >
> > > I have renamed PC1.1,PC2.1,PC3.1,PC4.1,PC5.1 to PC6,PC7,PC8,PC9,PC10 for
> > > simplicity.
> >
> > What does
> >
> > str(dist_scot24_vector_with_climate)
> >
> > show? I seem to recall getting similar errors when I'd done something
> > silly in a data prep routine and had data in a data frame that wasn't
> > numeric but looked like it was - a factor for example.
> >
> > If you can't do some quite simple things like the first of your three
> > alternatives above, that suggests something amiss with the data. That'd
> > be the first thing to check.
> >
> > HTH
> >
> > G
> >
> > > Regards
> > >
> > > On Tuesday 01 September 2009 17:31:04 Simon Wood wrote:
> > > > The basic problem is that you have requested a 10 dimensional thin
> > > > plate spline, with a basis dimension of 196830. In reality it will not
> > > > be possible to compute this, even if you have more than 196830 data. In
> > > > any case it would be unlikely to provide a very useful model --- the
> > > > "simplest" function that it can theoretically represent will have 3003
> > > > degrees of freedom.
> > > >
> > > > That said the error message is obviously rather unhelpful... Can you
> > > > tell me how many data you are actually trying to fit, and I'll try and
> > > > track down exactly where it's failing, and put in a more informative
> > > > message.
> > > >
> > > > best,
> > > > Simon
> > > >
> > > > On Tuesday 01 September 2009 14:51, Corrado wrote:
> > > > > Dear friends,
> > > > >
> > > > > what is this error message in gam???? I cannot understand what it
> > > > > means .... is it a bug?
> > > > >
> > > > > > gam_bray_scot24_pc_0505 <- gam(bray~s(PC1,PC2,PC3,PC4,PC5,
> > > > > > PC1.1,PC2.1,PC3.1,PC4.1,PC5.1),data=dist_scot24_vector_with_climate)
> > > > >
> > > > > Error in if (length(data) != vl) { :
> > > > >   missing value where TRUE/FALSE needed
> > > > > > Calls: gam ... smooth.construct -> smooth.construct.tp.smooth.spec -> array
> > > > > > In addition: Warning message:
> > > > > In array(0, n * k) : NAs introduced by coercion
> > > > > Execution halted
> > > > >
> > > > > Thanks in advance,
> > > > >
> > > > > Best regards
> 
> 
> 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



