[R] How to extract partial predictions, package mgcv
Staffan Roos
staffan.roos at bto.org
Wed Sep 16 19:31:51 CEST 2009
Hi David,
Yes, the strange code "lines" was clearly a mistake from my side. Sorry.
What "dataframe references" did you add in your code?
/Staffan
David Winsemius wrote:
>
> But removing the extraneous second to last line that says just:
>
> lines
>
> ... would leave the console looking less puzzling.
>
> --
> David
> On Sep 16, 2009, at 1:02 PM, David Winsemius wrote:
>
>> It is nice to see worked examples, but there were some errors that
>> relate to not including dataframe references. I've paste in code that
>> runs without error on my machine:
>>
>> On Sep 16, 2009, at 12:39 PM, Staffan Roos wrote:
>>
>>>
>>> Thanks Simon,
>>>
>>> I wasn't aware that "predict.gam" could be used in this situation. I
>>> followed you advice and managed to extract the data I needed.
>>>
>>> For people interested, I add the code with comments I used here:
>>>
>>> #############################################
>>> # Full code for extracting partial predictions
>>> # Start the package mgcv
>>> library(mgcv)
>>>
>>> # Simulate some data (this is from Simon Wood's example in the
>>> # package "mgcv" manual)
>>> n<-200
>>> sig <- 2
>>> dat <- gamSim(1,n=n,scale=sig)
>>>
>>> # Check the data
>>> dat
>>>
>>> ## It looks alright :-)
>>>
>>> # Run the GAM
>>> b<-gam(y~s(x0)+s(I(x1^2))+s(x2)+offset(x3),data=dat)
>>>
>>> # Plot the partial predictions (You may need to rescale your window
>>> to
>>> # see the non-linearity
>>> par(mfrow=c(1,3))
>>> plot(b, scale=0)
>>>
>>> ### Clearly, the variables x0 and x2 are non-linear!
>>
>>> # I would like to extract the "solid line" in the graphs and the
>> # associated standard errors from the plots. Thus, I use "predict"
>> # and change to a data.frame:
>> c<-data.frame(predict(b,type="terms",se.fit=TRUE)$fit)
>> names(c)<-c("pp_s.x0.","pp_s.I.x1.2..","pp_s.x2.")
>>
>> d<-data.frame(predict(b,type="terms",se.fit=TRUE)$se.fit)
>> names(d)<-c("se_s.x0.","se_s.I.x1.2..","se_s.x2.")
>>
>> # Merge the three datasets:
>> f=cbind(dat,c,d)
>>
>> #Restrict the number of variables to the ones I am interested in:
>> f<-f[,c("x0","pp_s.x0.", "se_s.x0.")]
>> names(f)
>>
>> ### This data, i.e. "f", can now be exported or used for further
>> ### calculations within R
>>
>> #plot the data
>> par(mfrow=c(1,1))
>> plot(pp_s.x0.~x0,type="p",pch=2,cex=0.50, data=f)
>> sequence=order(f$x0)
>> lines(f$x0[sequence],f$pp_s.x0.[sequence])
>> rug(jitter(f$x0))
>>
>>> ##########################################
>>>
>>>
>>> Staffan,
>>>
>>> Take a look at ?predict.gam. You need to use type="terms" with
>>> se=TRUE to
>>> get
>>> the quantities that you need.
>>>
>>> Simon
>>>
>>> ps. with binary data, method="REML" or method="ML" for the gam fit
>>> are
>>> often
>>> somewhat more reliable than the default.
>>>
>>> On Monday 14 September 2009 19:30, Staffan Roos wrote:
>>>> Dear package mgcv users,
>>>>
>>>>
>>>>
>>>> I am using package mgcv to describe presence of a migratory bird
>>>> species
>>>> as
>>>> a function of several variables, including year, day number (i.e.
>>>> day-of-the-year), duration of survey, latitude and longitude.
>>>> Thus, the
>>>> "global model" is:
>>>>
>>>> global_model<-gam(present ~ as.factor(year) + s(dayno, k=5) +
>>>> s(duration,
>>>> k=5) + s(x, k=5) + s(y, k=5), family = binomial(link = logit),
>>>> data =
>>>> presence, gamma =1.4)
>>>>
>>>> The model works fine, suggesting that all the variables have strong,
>>>> non-linear, influences on the probability of detecting the
>>>> species. My
>>>> main
>>>> interest is in the effect "dayno" has on presence, given the
>>>> inclusion of
>>>> the other explanatory variables. Thus, I would like to extract the
>>>> values
>>>> of the partial prediction of "dayno" and its associated 2 standard
>>>> errors
>>>> above and below the main effect, as shown by the code
>>>> "plot(global_model)".
>>>>
>>>> I have tried to extract the values by using
>>>> "fitted.values(global_model,dayno)", but when plotted, the figure
>>>> doesn't
>>>> look like the partial prediction for "dayno". Instead, it gives me
>>>> a very
>>>> scattered figure ("shotgun effect").
>>>>
>>>> Has anyone tried to extract the partial predictions? If so, please
>>>> could
>>>> you advise me how to do this?
>>>>
>>
>>>> Staffan
>>>>
>>>> P.S.. For comparison, please have a look at Simon Woods paper in R
>>>> News,
>>>> 1(2):20-25, June 2001, especially the figures showing the partial
>>>> predictions of Mackerel egg densities. Using those graphs as an
>>>> example, I
>>>> would like to extract the partial predictions for e.g.
>>>> "s(b.depth)", given
>>>> the inclusion of the other variables.
>>>>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
--
View this message in context: http://www.nabble.com/How-to-extract-partial-predictions%2C-package-mgcv-tp25441149p25477056.html
Sent from the R help mailing list archive at Nabble.com.
More information about the R-help
mailing list