[R] How to extract partial predictions, package mgcv

Wed Sep 16 19:31:51 CEST 2009

Hi David, 

Yes, the strange code "lines" was clearly a mistake from my side. Sorry.
What "dataframe references" did you add in your code?

/Staffan

David Winsemius wrote:
> 
> But removing the extraneous second to last line that says just:
> 
> lines
> 
> ... would leave the console looking less puzzling.
> 
> --  
> David
> On Sep 16, 2009, at 1:02 PM, David Winsemius wrote:
> 
>> It is nice to see worked examples, but there were some errors that
>> relate to not including dataframe references. I've paste in code that
>> runs without error on my machine:
>>
>> On Sep 16, 2009, at 12:39 PM, Staffan Roos wrote:
>>
>>>
>>> Thanks Simon,
>>>
>>> I wasn't aware that "predict.gam" could be used in this situation. I
>>> followed you advice and managed to extract the data I needed.
>>>
>>> For people interested, I add the code with comments I used here:
>>>
>>> #############################################
>>> # Full code for extracting partial predictions
>>> # Start the package mgcv
>>> library(mgcv)
>>>
>>> # Simulate some data (this is from Simon Wood's example in the
>>> # package "mgcv" manual)
>>> n<-200
>>> sig <- 2
>>> dat <- gamSim(1,n=n,scale=sig)
>>>
>>> # Check the data
>>> dat
>>>
>>> ## It looks alright :-)
>>>
>>> # Run the GAM
>>> b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
>>>
>>> # Plot the partial predictions (You may need to rescale your window  
>>> to
>>> # see the non-linearity
>>> par(mfrow=c(1,3))
>>> plot(b, scale=0)
>>>
>>> ### Clearly, the variables x0 and x2 are non-linear!
>>
>>> # I would like to extract the "solid line" in the graphs and the
>> # associated standard errors from the plots. Thus, I use "predict"
>> # and change to a data.frame:
>> c<-data.frame(predict(b,type="terms",se.fit=TRUE)$fit)
>> names(c)<-c("pp_s.x0.","pp_s.I.x1.2..","pp_s.x2.")
>>
>> d<-data.frame(predict(b,type="terms",se.fit=TRUE)$se.fit)
>> names(d)<-c("se_s.x0.","se_s.I.x1.2..","se_s.x2.")
>>
>> # Merge the three datasets:
>> f=cbind(dat,c,d)
>>
>> #Restrict the number of variables to the ones I am interested in:
>> f<-f[,c("x0","pp_s.x0.", "se_s.x0.")]
>> names(f)
>>
>> ### This data, i.e. "f", can now be exported or used for further
>> ### calculations within R
>>
>> #plot the data
>> par(mfrow=c(1,1))
>> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
>> sequence=order(f$x0)
>> lines(f$x0[sequence],f$pp_s.x0.[sequence])
>> rug(jitter(f$x0))
>>
>>> ##########################################
>>>
>>>
>>> Staffan,
>>>
>>> Take a look at ?predict.gam. You need to use type="terms" with  
>>> se=TRUE to
>>> get
>>> the quantities that you need.
>>>
>>> Simon
>>>
>>> ps. with binary data,  method="REML" or method="ML" for the gam fit  
>>> are
>>> often
>>> somewhat more reliable than  the default.
>>>
>>> On Monday 14 September 2009 19:30, Staffan Roos wrote:
>>>> Dear package mgcv users,
>>>>
>>>>
>>>>
>>>> I am using package mgcv to describe presence of a migratory bird  
>>>> species
>>>> as
>>>> a function of several variables, including year, day number (i.e.
>>>> day-of-the-year), duration of survey, latitude and longitude.  
>>>> Thus, the
>>>> "global model" is:
>>>>
>>>> global_model<-gam(present ~ as.factor(year) + s(dayno, k=5) +  
>>>> s(duration,
>>>> k=5) + s(x, k=5) + s(y, k=5), family = binomial(link = logit),  
>>>> data =
>>>> presence, gamma =1.4)
>>>>
>>>> The model works fine, suggesting that all the variables have strong,
>>>> non-linear, influences on the probability of detecting the  
>>>> species. My
>>>> main
>>>> interest is in the effect "dayno" has on presence, given the  
>>>> inclusion of
>>>> the other explanatory variables. Thus, I would like to extract the  
>>>> values
>>>> of the partial prediction of "dayno" and its associated 2 standard  
>>>> errors
>>>> above and below the main effect, as shown by the code
>>>> "plot(global_model)".
>>>>
>>>> I have tried to extract the values by using
>>>> "fitted.values(global_model,dayno)", but when plotted, the figure  
>>>> doesn't
>>>> look like the partial prediction for "dayno". Instead, it gives me  
>>>> a very
>>>> scattered figure ("shotgun effect").
>>>>
>>>> Has anyone tried to extract the partial predictions? If so, please  
>>>> could
>>>> you advise me how to do this?
>>>>
>>
>>>> Staffan
>>>>
>>>> P.S.. For comparison, please have a look at Simon Woods paper in R  
>>>> News,
>>>> 1(2):20-25, June 2001, especially the figures showing the partial
>>>> predictions of Mackerel egg densities. Using those graphs as an  
>>>> example, I
>>>> would like to extract the partial predictions for e.g.  
>>>> "s(b.depth)", given
>>>> the inclusion of the other variables.
>>>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-extract-partial-predictions%2C-package-mgcv-tp25441149p25477056.html
Sent from the R help mailing list archive at Nabble.com.