[R-sig-Geo] Predict a gam model with factors to a raster

Wed Nov 12 17:51:45 CET 2014

Hi,

I have edited my earlier question to make it easier to understand.

I hope someone can help with a problem I have when predicting from a model with factors using the 'raster' package.

I would like to do the same thing as this code from the BRT vignette:

    ####Example BRT

    library(dismo)
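    data(Anguilla_train)   # training data used by gbm.step() below
    data(Anguilla_grids)   # raster predictors used by predict() below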
    angaus.tc5.lr005 <- gbm.step(data = Anguilla_train, gbm.x = 3:13, gbm.y = 2,
                                 family = "bernoulli", tree.complexity = 5,
                                 learning.rate = 0.005, bag.fraction = 0.5)

    Method <- factor('electric', levels = levels(Anguilla_train$Method)) 
    add <- data.frame(Method)
    str(add)

    p <- predict(Anguilla_grids, angaus.tc5.lr005, const=add, n.trees=angaus.tc5.lr005$gbm.call$best.trees, type="response") 
    p <- mask(p, raster(Anguilla_grids, 1))
    plot(p, main='Angaus - BRT prediction')

    #####

The code above passes a "const" argument to predict() to handle predictors for which I have no raster, such as the capture method.

As described in the raster package documentation, "const" can be used to add a constant for which there is no Raster object for model predictions. In my case it is a categorical variable.

Below is a reproducible example that illustrates my problem with GAM and GLM models that have categorical predictors.

This may be a basic mistake in my R code, but I have not been able to find it myself.

Thank you.

Thiago 

    ##### Problem with GAM and GLM
    library(mgcv)
    library(raster)
    library(rgdal)

    #raster layer
    v1rst<-raster()
    values(v1rst) <- 1:ncell(v1rst)
    names(v1rst)<-'v1'
    plot(v1rst)

    # Simple example of response variable and predictors
    y<-c(1,33,500,700, 334,320, 703,303,3030,3002,200,0,100,100,169)
    v1<-c(12,33,544,600, 34,30, 03,3390,3030,302,20,108,170,101,2009)
    v2<-c('t','t','t','t','t','t','t','t','c','c','c','c','c','c','c' )
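    # stringsAsFactors = TRUE keeps v2 a factor on R >= 4.0, so that
    # levels(df$v2) below is not NULL (it was the default in older R)
    df<-data.frame(y, v1, v2, stringsAsFactors = TRUE)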

    #GAM model with factor
    gam1<-gam(y~s(v1)+factor(v2), data=df)
    summary(gam1)

    #GAM model without factor
    gam2<-gam(y~s(v1), data=df)
    summary(gam2)

    #GLM with factor
    glm1<-glm(y~v1 + factor(v2), data=df)
    summary(glm1)

    #GLM without factor
    glm2<-glm(y~v1, data=df)
    summary(glm2)

    # Data frame holding a constant value (of class 'factor')
    # to pass on to the predict function via 'const'.
    v2<-factor( 't',levels=levels(df$v2)) 
    add2<-data.frame(v2)
    str(add2)              

    #Prediction GAM with factor
    p<-predict(v1rst,gam1, const=add2, type='response')

    # The call above fails with:
    # Error in `[.data.frame`(blockvals, , f[j]) : undefined columns selected

    #Prediction without factor
    p<-predict(v1rst,gam2, type="response")
    plot(p) ## ok!

    #Prediction glm
    #with factor
    glm1p<-predict(v1rst, glm1, type='response', const=add2)
    #error

    #Prediction without factor
    glm2p<-predict(v1rst, glm2, type='response')
    plot(glm2p) ##ok!!
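
One thing that may be worth checking, offered as an unverified guess rather than a diagnosis: both failing models wrap the predictor in `factor()` inside the formula. The base-R sketch below (the object names `glm_inline` and `glm_column` are made up for illustration) only shows that the variable name recorded in the fitted model then becomes `factor(v2)` instead of `v2`, so it no longer matches the column name supplied through "const". Whether this mismatch is what triggers the "undefined columns selected" error in raster is an assumption.

```r
# Same toy data as in the example; v2 is stored directly as a factor column.
y  <- c(1, 33, 500, 700, 334, 320, 703, 303, 3030, 3002, 200, 0, 100, 100, 169)
v1 <- c(12, 33, 544, 600, 34, 30, 3, 3390, 3030, 302, 20, 108, 170, 101, 2009)
v2 <- factor(rep(c("t", "c"), times = c(8, 7)))
df <- data.frame(y, v1, v2)

# Factor created inside the formula (as in gam1 and glm1)
glm_inline <- glm(y ~ v1 + factor(v2), data = df)

# Factor taken as-is from the data frame column
glm_column <- glm(y ~ v1 + v2, data = df)

# Variable names the fitted models record for their model frame
vars_inline <- names(attr(terms(glm_inline), "dataClasses"))
vars_column <- names(attr(terms(glm_column), "dataClasses"))

print(vars_inline)  # "y" "v1" "factor(v2)"
print(vars_column)  # "y" "v1" "v2"
```

If this is indeed the cause, fitting the models with `v2` as a factor column in the data frame, rather than calling `factor(v2)` in the formula, might be something to try before calling predict() with "const".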