[R] How to extract partial predictions, package mgcv

David Winsemius dwinsemius at comcast.net
Wed Sep 16 19:02:11 CEST 2009


It is nice to see worked examples, but there were some errors that
relate to not including dataframe references. I've paste in code that
runs without error on my machine:

On Sep 16, 2009, at 12:39 PM, Staffan Roos wrote:

>
> Thanks Simon,
>
> I wasn't aware that "predict.gam" could be used in this situation. I
> followed you advice and managed to extract the data I needed.
>
> For people interested, I add the code with comments I used here:
>
> #############################################
> # Full code for extracting partial predictions
> # Start the package mgcv
> library(mgcv)
>
> # Simulate some data (this is from Simon Wood's example in the
> # package "mgcv" manual)
> n<-200
> sig <- 2
> dat <- gamSim(1,n=n,scale=sig)
>
> # Check the data
> dat
>
> ## It looks alright :-)
>
> # Run the GAM
> b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
>
> # Plot the partial predictions (You may need to rescale your window to
> # see the non-linearity
> par(mfrow=c(1,3))
> plot(b, scale=0)
>
> ### Clearly, the variables x0 and x2 are non-linear!

> # I would like to extract the "solid line" in the graphs and the
# associated standard errors from the plots. Thus, I use "predict"
# and change to a data.frame:
c<-data.frame(predict(b,type="terms",se.fit=TRUE)$fit)
names(c)<-c("pp_s.x0.","pp_s.I.x1.2..","pp_s.x2.")

d<-data.frame(predict(b,type="terms",se.fit=TRUE)$se.fit)
names(d)<-c("se_s.x0.","se_s.I.x1.2..","se_s.x2.")

# Merge the three datasets:
f=cbind(dat,c,d)

#Restrict the number of variables to the ones I am interested in:
f<-f[,c("x0","pp_s.x0.", "se_s.x0.")]
names(f)

### This data, i.e. "f", can now be exported or used for further
### calculations within R

#plot the data
par(mfrow=c(1,1))
plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
sequence=order(f$x0)
lines(f$x0[sequence],f$pp_s.x0.[sequence])
lines
rug(jitter(f$x0))

> ##########################################
>
>
> Staffan,
>
> Take a look at ?predict.gam. You need to use type="terms" with  
> se=TRUE to
> get
> the quantities that you need.
>
> Simon
>
> ps. with binary data,  method="REML" or method="ML" for the gam fit  
> are
> often
> somewhat more reliable than  the default.
>
> On Monday 14 September 2009 19:30, Staffan Roos wrote:
>> Dear package mgcv users,
>>
>>
>>
>> I am using package mgcv to describe presence of a migratory bird  
>> species
>> as
>> a function of several variables, including year, day number (i.e.
>> day-of-the-year), duration of survey, latitude and longitude. Thus,  
>> the
>> "global model" is:
>>
>> global_model<-gam(present ~ as.factor(year) + s(dayno, k=5) +  
>> s(duration,
>> k=5) + s(x, k=5) + s(y, k=5), family = binomial(link = logit), data =
>> presence, gamma =1.4)
>>
>> The model works fine, suggesting that all the variables have strong,
>> non-linear, influences on the probability of detecting the species.  
>> My
>> main
>> interest is in the effect "dayno" has on presence, given the  
>> inclusion of
>> the other explanatory variables. Thus, I would like to extract the  
>> values
>> of the partial prediction of "dayno" and its associated 2 standard  
>> errors
>> above and below the main effect, as shown by the code
>> "plot(global_model)".
>>
>> I have tried to extract the values by using
>> "fitted.values(global_model,dayno)", but when plotted, the figure  
>> doesn't
>> look like the partial prediction for "dayno". Instead, it gives me  
>> a very
>> scattered figure ("shotgun effect").
>>
>> Has anyone tried to extract the partial predictions? If so, please  
>> could
>> you advise me how to do this?
>>

>> Staffan
>>
>> P.S.. For comparison, please have a look at Simon Woods paper in R  
>> News,
>> 1(2):20-25, June 2001, especially the figures showing the partial
>> predictions of Mackerel egg densities. Using those graphs as an  
>> example, I
>> would like to extract the partial predictions for e.g.  
>> "s(b.depth)", given
>> the inclusion of the other variables.
>>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list