[R] Problem with simple random slope in gam and bam (mgcv package)

Simon Wood s.wood at bath.ac.uk
Wed Nov 9 15:09:22 CET 2011


Martijn,

Thanks for this. It's a bug. The p-value computation involves model 
matrices for each 'smooth' term (in your case actually a random effect). 
When the data set is large, random sub-sampling of the data is used 
to keep the computational cost of these model matrices down. This is OK 
for continuous predictors, but for factor predictors used in "re" terms 
the sub-sample can miss some levels of the factor, and the computation 
then fails due to rank deficiency. This possibility had not 
previously occurred to me.
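
To illustrate the mechanism, here is a minimal sketch in base R. It is not 
the code mgcv actually uses: the variable names, the 300-level factor and 
the sub-sample size of 200 are all assumptions chosen so that the effect is 
guaranteed to show up. It builds the indicator model matrix for a factor, 
takes a random sub-sample of rows, and checks for all-zero columns. A 
random slope would only multiply each row by the covariate, which does not 
change the argument.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only

n <- 2000                         # hypothetical number of observations
# Hypothetical factor with 300 levels; every level occurs in the full data
participant <- factor(sample(rep_len(1:300, n)))

# Indicator model matrix: one column per factor level
X_full <- model.matrix(~ participant - 1)

# Random sub-sample with fewer rows than there are levels, so at least
# 100 levels must be absent from it
idx   <- sample(n, 200)
X_sub <- X_full[idx, , drop = FALSE]

# Columns that are entirely zero correspond to levels the sub-sample missed
missing_levels <- sum(colSums(X_sub) == 0)
rank_sub       <- qr(X_sub)$rank

cat("levels missing from sub-sample:", missing_levels, "\n")
cat("columns:", ncol(X_sub), " rank:", rank_sub, "\n")
```

Any column of zeros makes the sub-sampled matrix rank deficient, and 
anything downstream that assumes full column rank can then break.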

I'll work out a fix...

best,
Simon

On 09/11/11 12:41, Martijn Wieling wrote:
> Dear useRs,
>
> This is the first time I post to this list and I would appreciate any
> help available. I've used the excellent mgcv package for a while now
> to investigate geographical patterns of language variation, and it has
> always worked without any problems for me. The problem below
> occurs using R 2.14.0 (both 32 and 64 bit versions in Windows and the
> 64 bit version in Unix) and mgcv (both version 1.7-10 and 1.7-6).
>
> In my (simplified) model predicting pronunciation distance I'd like to
> include a random slope per Participant for a binary value (IsDem)
> which stores a word-specific characteristic. I load the data
> (available at http://www.martijnwieling.nl/dat.csv) and run the model
> as follows:
>
>> library(mgcv) # version 1.7-10, but problem also occurs with earlier versions (e.g., 1.7-6)
>> dat = read.csv('dat.csv',header=TRUE) # data available at: http://www.martijnwieling.nl/dat.csv
>> dim(dat) # the original dataset is larger, but the problem also occurs in this subset
> [1] 20000     4
>> model = bam(PronDist ~ s(Participant,IsDem,bs="re"), data=dat)
>> print(model) # works fine
>> summary(model, freq=TRUE) # works fine
>> summary(model) # the Bayesian p-value estimation does not work:
> Error in eigen(B, symmetric = TRUE) : infinite or missing values in 'x'
>
> I obviously am interested in more complex models, but whenever I
> include any binary value as a by-word or by-participant random slope I
> get the same error. I've tried to locate the error and it appears to
> occur in the function pinvXVX in the block which 'deals with the
> fractional part of the pinv'.
>
> Any help would be appreciated!
>
> With kind regards,
>
> Martijn Wieling
> University of Groningen
> http://www.martijnwieling.nl


-- 
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283
