[R] package mgcv - predict with bam: Error in X[ind, ] : subscript out of bounds
Katharina May
may.katharina at googlemail.com
Mon Feb 3 12:59:57 CET 2014
Hi Simon,
many thanks for looking into this and making me understand the problem!
I'll adjust my factor levels right away...
Best, Katharina
On 3 February 2014 12:42, Simon Wood <s.wood at bath.ac.uk> wrote:
> Hi Katharina,
>
> Thanks for sending this.
>
> The problem is that the prediction data for site contain levels not
> available in the (useable non-NA) fit data...
>
>> levels(m$model$site)
> [1] "KRB" "NP.FOR" "WKS.FRE" "WKS.KRE" "WKS.RIE" "WKS.WUE"
>> levels(gapData$site)
> [1] "KRB" "NP.FOR" "RIE.2" "WKS.BBR" "WKS.FRE" "WKS.HOE" "WKS.KRE"
> [8] "WKS.RIE" "WKS.WUE"
>
> predict.lm has a check for this, and so fails with a rather more informative
> error message. e.g.
>
> m0 <- lm(sensor1 ~ sensor2 + site + site:NthSampling,
> data=xylemRohWeekXnn2011,na.action=na.omit)
> predict(m0,gapData)
> ... factor site has new levels RIE.2, WKS.BBR
>
> I'll add a better check to predict.gam.
>
> best,
> Simon
>
> ps. if you want predictions with the random effects for site set to zero
> then one trick is to use terms like s(site,bs="re",by=dum) in fitting with
> dum set to 1. Then in prediction you can set 'site' to any existing level,
> and dum to zero, in order to get a prediction for the missing level, with
> the 'site' effect set to zero.
>
>
>
> On 02/02/14 17:52, Katharina May wrote:
>>
>> Hi Simon,
>>
>> thank you for your reply, I really appreciate any help to understand
>> the problem here...
>> Unluckily the package upgrade didn't help with this issue.
>> An example reproducing the error, and a current sessionInfo() Output
>> can be found below.
>>
>> Many thanks once again,
>>
>> Katharina
>>
>>
>> R Code Example
>> <snip>
>> library(RCurl)
>> library(mgcv)
>> #retrieve xylemRohWeekXnn2011 test data frame
>> eval( expr = parse( text =
>>
>> getURL("https://webdisk.ads.mwn.de/Handlers/AnonymousDownload.ashx?folder=1a7cbaa4&path=xylemRohWeekXnn2011.R")
>> ))
>>
>> xylemRohWeekXnn.fit.bam <- bam(sensor1 ~ sensor2 + s(site, bs="re")
>> + s(site, NthSampling, bs="re") , data=xylemRohWeekXnn2011,
>> na.action=na.omit)
>>
>> #subset data containing gaps for predicting
>> gapData <- xylemRohWeekXnn2011[is.na(xylemRohWeekXnn2011[,2]) &
>> !is.na(xylemRohWeekXnn2011[,11]),c(2:3,6:7, 11)]
>>
>> xylemRohWeekXnnSite.fit <-
>> predict.gam(xylemRohWeekXnn.fit.bam,gapData, type="response", se=F)
>> </snap>
>>
>>
>>
>> My current Session Information (sessionInfo() Output - also confirming
>> that the problem exists on both Windows and Mac OS X):
>> <snip>
>> R version 3.0.2 (2013-09-25)
>> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> other attached packages:
>> [1] mgcv_1.7-28 nlme_3.1-113 RCurl_1.95-4.1 bitops_1.0-6
>>
>> loaded via a namespace (and not attached):
>> [1] grid_3.0.2 lattice_0.20-24 Matrix_1.1-2 tools_3.0.2
>> </snap>
>>
>>
>>
>>
>> On 31/01/14 12:57, Simon Wood wrote:
>>>
>>>
>>> Hi Katharina,
>>>
>>> Could you try upgrading to mgcv_1.7-28, please? There was an occasional
>>> problem to do with matching factor levels, which is fixed, but I'm not
>>> very confident that is what is going on.
>>>
>>> If upgrading doesn't work, is there any chance you could send me a small
>>> example dataset and code that produces the error, and I'll look at it?
>>>
>>> best,
>>> Simon
>>>
>>> --
>>> Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
>>> +44 (0)1225 386603 http://people.bath.ac.uk/sw283
>
>
>
> --
> Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
> +44 (0)1225 386603 http://people.bath.ac.uk/sw283
--
Katharina May
BSc. Forest Science and Resource Management
IT Specialist (CCI)
Prinz-Ludwig-Str. 7
85354 Freising
Germany
Mobile: +49-176-24031809
www: m3y.de
More information about the R-help
mailing list