[R] How to extract partial predictions, package mgcv

Wed Sep 16 19:52:54 CEST 2009

Yes, you're correct David, I used attach(f) earlier. Thanks for clarifying! I
should change my own code accordingly. 
/Staffan

David Winsemius wrote:
> 
> Your code minus the extraneous "lines"  was:
> 
> #plot the data
> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50)
> sequence=order(x0)
> lines(x0[sequence],pp_s.x0.[sequence])
> rug(jitter(x0))
> 
> My edition was:
> 
> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
> sequence=order(f$x0)
> lines(f$x0[sequence],f$pp_s.x0.[sequence])
> rug(jitter(f$x0))
> 
> So almost every one of the last set of plotting needed one or more  
> additions
> of "f$" or "data=f" in order to run without error. Perhaps you  
> attached "f"
> earlier and didn't include that code? I personally have given up using  
> attach()
> for just that reason.
> 
> -- 
> David.
> 
> 
> On Sep 16, 2009, at 1:31 PM, Staffan Roos wrote:
> 
>>
>> Hi David,
>>
>> Yes, the strange code "lines" was clearly a mistake from my side.  
>> Sorry.
>> What "dataframe references" did you add in your code?
>>
>> /Staffan
>>
>>
>>
>> David Winsemius wrote:
>>>
>>> But removing the extraneous second to last line that says just:
>>>
>>> lines
>>>
>>> ... would leave the console looking less puzzling.
>>>
>>> --  
>>> David
>>> On Sep 16, 2009, at 1:02 PM, David Winsemius wrote:
>>>
>>>> It is nice to see worked examples, but there were some errors that
>>>> relate to not including dataframe references. I've paste in code  
>>>> that
>>>> runs without error on my machine:
>>>>
>>>> On Sep 16, 2009, at 12:39 PM, Staffan Roos wrote:
>>>>
>>>>>
>>>>> Thanks Simon,
>>>>>
>>>>> I wasn't aware that "predict.gam" could be used in this  
>>>>> situation. I
>>>>> followed you advice and managed to extract the data I needed.
>>>>>
>>>>> For people interested, I add the code with comments I used here:
>>>>>
>>>>> #############################################
>>>>> # Full code for extracting partial predictions
>>>>> # Start the package mgcv
>>>>> library(mgcv)
>>>>>
>>>>> # Simulate some data (this is from Simon Wood's example in the
>>>>> # package "mgcv" manual)
>>>>> n<-200
>>>>> sig <- 2
>>>>> dat <- gamSim(1,n=n,scale=sig)
>>>>>
>>>>> # Check the data
>>>>> dat
>>>>>
>>>>> ## It looks alright :-)
>>>>>
>>>>> # Run the GAM
>>>>> b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
>>>>>
>>>>> # Plot the partial predictions (You may need to rescale your window
>>>>> to
>>>>> # see the non-linearity
>>>>> par(mfrow=c(1,3))
>>>>> plot(b, scale=0)
>>>>>
>>>>> ### Clearly, the variables x0 and x2 are non-linear!
>>>>
>>>>> # I would like to extract the "solid line" in the graphs and the
>>>> # associated standard errors from the plots. Thus, I use "predict"
>>>> # and change to a data.frame:
>>>> c<-data.frame(predict(b,type="terms",se.fit=TRUE)$fit)
>>>> names(c)<-c("pp_s.x0.","pp_s.I.x1.2..","pp_s.x2.")
>>>>
>>>> d<-data.frame(predict(b,type="terms",se.fit=TRUE)$se.fit)
>>>> names(d)<-c("se_s.x0.","se_s.I.x1.2..","se_s.x2.")
>>>>
>>>> # Merge the three datasets:
>>>> f=cbind(dat,c,d)
>>>>
>>>> #Restrict the number of variables to the ones I am interested in:
>>>> f<-f[,c("x0","pp_s.x0.", "se_s.x0.")]
>>>> names(f)
>>>>
>>>> ### This data, i.e. "f", can now be exported or used for further
>>>> ### calculations within R
>>>>
>>>> #plot the data
>>>> par(mfrow=c(1,1))
>>>> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
>>>> sequence=order(f$x0)
>>>> lines(f$x0[sequence],f$pp_s.x0.[sequence])
>>>> rug(jitter(f$x0))
>>>>
>>>>> ##########################################
>>>>>
>>>>>
>>>>> Staffan,
>>>>>
>>>>> Take a look at ?predict.gam. You need to use type="terms" with
>>>>> se=TRUE to
>>>>> get
>>>>> the quantities that you need.
>>>>>
>>>>> Simon
>>>>>
>>>>> ps. with binary data,  method="REML" or method="ML" for the gam fit
>>>>> are
>>>>> often
>>>>> somewhat more reliable than  the default.
>>>>>
>>>>> On Monday 14 September 2009 19:30, Staffan Roos wrote:
>>>>>> Dear package mgcv users,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am using package mgcv to describe presence of a migratory bird
>>>>>> species
>>>>>> as
>>>>>> a function of several variables, including year, day number (i.e.
>>>>>> day-of-the-year), duration of survey, latitude and longitude.
>>>>>> Thus, the
>>>>>> "global model" is:
>>>>>>
>>>>>> global_model<-gam(present ~ as.factor(year) + s(dayno, k=5) +
>>>>>> s(duration,
>>>>>> k=5) + s(x, k=5) + s(y, k=5), family = binomial(link = logit),
>>>>>> data =
>>>>>> presence, gamma =1.4)
>>>>>>
>>>>>> The model works fine, suggesting that all the variables have  
>>>>>> strong,
>>>>>> non-linear, influences on the probability of detecting the
>>>>>> species. My
>>>>>> main
>>>>>> interest is in the effect "dayno" has on presence, given the
>>>>>> inclusion of
>>>>>> the other explanatory variables. Thus, I would like to extract the
>>>>>> values
>>>>>> of the partial prediction of "dayno" and its associated 2 standard
>>>>>> errors
>>>>>> above and below the main effect, as shown by the code
>>>>>> "plot(global_model)".
>>>>>>
>>>>>> I have tried to extract the values by using
>>>>>> "fitted.values(global_model,dayno)", but when plotted, the figure
>>>>>> doesn't
>>>>>> look like the partial prediction for "dayno". Instead, it gives me
>>>>>> a very
>>>>>> scattered figure ("shotgun effect").
>>>>>>
>>>>>> Has anyone tried to extract the partial predictions? If so, please
>>>>>> could
>>>>>> you advise me how to do this?
>>>>>>
>>>>
>>>>>> Staffan
>>>>>>
>>>>>> P.S.. For comparison, please have a look at Simon Woods paper in R
>>>>>> News,
>>>>>> 1(2):20-25, June 2001, especially the figures showing the partial
>>>>>> predictions of Mackerel egg densities. Using those graphs as an
>>>>>> example, I
>>>>>> would like to extract the partial predictions for e.g.
>>>>>> "s(b.depth)", given
>>>>>> the inclusion of the other variables.
>>>>>>
>>>>
>>>> David Winsemius, MD
>>>> Heritage Laboratories
>>>> West Hartford, CT
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> -- 
>> View this message in context:
>> http://www.nabble.com/How-to-extract-partial-predictions%2C-package-mgcv-tp25441149p25477056.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: http://www.nabble.com/How-to-extract-partial-predictions%2C-package-mgcv-tp25441149p25477368.html
Sent from the R help mailing list archive at Nabble.com.