predict.bam {mgcv} | R Documentation |
Prediction from a fitted Big Additive Model
Description
In most cases essentially a wrapper for predict.gam for prediction from a model fitted by bam.
Can compute on a parallel cluster. For models fitted using discrete methods (discrete=TRUE),
discrete prediction methods are used instead.
Takes a fitted bam object produced by bam and produces predictions given a new set of values
for the model covariates or the original values used for the model fit. Predictions can be
accompanied by standard errors, based on the posterior distribution of the model
coefficients. The routine can optionally return the matrix by which the model
coefficients must be pre-multiplied in order to yield the values of the linear predictor at
the supplied covariate values: this is useful for obtaining credible regions
for quantities derived from the model (e.g. derivatives of smooths), and for lookup table
prediction outside R.
Usage
## S3 method for class 'bam'
predict(object,newdata,type="link",se.fit=FALSE,terms=NULL,
exclude=NULL,block.size=50000,newdata.guaranteed=FALSE,
na.action=na.pass,cluster=NULL,discrete=TRUE,n.threads=1,gc.level=0,...)
Arguments
object |
a fitted |
newdata |
A data frame or list containing the values of the model covariates at which predictions
are required. If this is not provided then predictions corresponding to the
original data are returned. If |
type |
When this has the value |
se.fit |
when this is TRUE (not default) standard error estimates are returned for each prediction. |
terms |
if |
exclude |
if |
block.size |
maximum number of predictions to process per call to underlying code: larger is quicker, but more memory intensive. |
newdata.guaranteed |
Set to |
na.action |
what to do about |
cluster |
|
discrete |
if |
n.threads |
if |
gc.level |
increase from 0 to raise the level of garbage collection if the default is insufficient. |
... |
other arguments. |
Details
The standard errors produced by predict.gam are based on the
Bayesian posterior covariance matrix of the parameters Vp in the fitted bam object.
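The relationship between the standard errors and Vp can be checked directly. The following is a minimal sketch, assuming simulated data from gamSim and a model fitted without discrete methods (so that prediction goes through predict.gam); the object names dat, b, nd and se_manual are illustrative only.

```r
library(mgcv)

set.seed(1)
## simulated data; gamSim(1, ...) gives covariates x0..x3 and response y
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
b <- bam(y ~ s(x0) + s(x1) + s(x2), data = dat)

nd <- dat[1:10, ]

## standard errors as returned by predict
p <- predict(b, newdata = nd, se.fit = TRUE)

## the same quantity from the lpmatrix and Vp:
## se_i = sqrt(x_i' Vp x_i), where x_i is row i of the lpmatrix
Xp <- predict(b, newdata = nd, type = "lpmatrix")
se_manual <- sqrt(rowSums((Xp %*% b$Vp) * Xp))

all.equal(as.numeric(p$se.fit), as.numeric(se_manual))
```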
To facilitate plotting with termplot, if object possesses an attribute "para.only"
and type=="terms" then only parametric terms of order 1 are returned
(i.e. those that termplot can handle).
Note that, in common with other prediction functions, any offset supplied to bam
as an argument is always ignored when predicting, unlike offsets specified in the
bam model formula.
See the examples in predict.gam for how to use the lpmatrix for obtaining credible
regions for quantities derived from the model.
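As an illustration of the lpmatrix approach, the sketch below approximates the derivative of the linear predictor with respect to x2 by finite differencing two lpmatrix calls, then forms approximate intervals of plus or minus two standard errors. It assumes simulated gamSim data; the step size eps, the prediction grid and all object names are choices made here for illustration, not part of the package.

```r
library(mgcv)

set.seed(2)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
b <- bam(y ~ s(x0) + s(x1) + s(x2), data = dat)

eps <- 1e-4  # finite difference step (an arbitrary illustrative choice)
nd0 <- data.frame(x0 = 0.5, x1 = 0.5, x2 = seq(0.05, 0.95, length.out = 50))
nd1 <- nd0
nd1$x2 <- nd1$x2 + eps

X0 <- predict(b, newdata = nd0, type = "lpmatrix")
X1 <- predict(b, newdata = nd1, type = "lpmatrix")

## Xd maps the coefficients to an approximate derivative w.r.t. x2
Xd <- (X1 - X0) / eps

deriv_hat <- drop(Xd %*% coef(b))
deriv_se  <- sqrt(rowSums((Xd %*% b$Vp) * Xd))

## approximate 95% credible limits for the derivative
lower <- deriv_hat - 2 * deriv_se
upper <- deriv_hat + 2 * deriv_se
```

Because x0 and x1 are held fixed, only the columns of the lpmatrix belonging to s(x2) differ between the two calls, so the result is the derivative of that smooth alone.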
When discrete=TRUE the prediction data in newdata is discretized in the same way as is done
when using discrete fitting methods with bam. However, the discretization grids are not
currently identical to those used during fitting. Instead, discretization is done afresh for
the prediction data. This means that if you are predicting for a relatively small set of
prediction data, or on a regular grid, then the results may in fact be identical to those
obtained without discretization. The disadvantage to this approach is that if you make
predictions with a large data frame, and then split it into smaller data frames to make the
predictions again, the results may differ slightly, because of slightly different
discretization errors.
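The effect of re-discretizing the prediction data can be examined along the following lines. This is a sketch only, assuming simulated gamSim data and uniform random prediction covariates; the object names are illustrative, and the size of any discrepancy depends on the data, so no particular values are claimed.

```r
library(mgcv)

set.seed(3)
dat <- gamSim(1, n = 2000, dist = "normal", scale = 2, verbose = FALSE)

## model fitted by discrete methods
b <- bam(y ~ s(x0) + s(x1) + s(x2), data = dat, discrete = TRUE)

## a larger set of prediction data
n_new <- 5000
nd <- data.frame(x0 = runif(n_new), x1 = runif(n_new), x2 = runif(n_new))
first_half <- seq_len(n_new) <= n_new / 2

## discrete prediction: all at once, and in two pieces
p_all   <- as.numeric(predict(b, newdata = nd))
p_split <- c(as.numeric(predict(b, newdata = nd[first_half, ])),
             as.numeric(predict(b, newdata = nd[!first_half, ])))

## regular (non-discrete) prediction for comparison
p_exact <- as.numeric(predict(b, newdata = nd, discrete = FALSE))

max(abs(p_all - p_split))  # discrepancy from splitting the data
max(abs(p_all - p_exact))  # discrepancy from discretization itself
```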
Value
If type=="lpmatrix" then a matrix is returned which will
give a vector of linear predictor values (minus any offset) at the supplied covariate
values, when applied to the model coefficient vector.
Otherwise, if se.fit is TRUE then a 2 item list is returned with items (both arrays) fit
and se.fit containing predictions and associated standard error estimates, otherwise an
array of predictions is returned. The dimensions of the returned arrays depend on whether
type is "terms" or not: if it is then the array is 2 dimensional with each
term in the linear predictor separate, otherwise the array is 1 dimensional and contains the
linear predictor/predicted values (or corresponding s.e.s). The linear predictor returned
termwise will not include the offset or the intercept.
newdata can be a data frame, list or model.frame: if it's a model frame
then all variables must be supplied.
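The return types described above can be inspected as in the following sketch, which assumes simulated gamSim data and a Gaussian model with three smooth terms, no offset and no discrete fitting; object names are illustrative.

```r
library(mgcv)

set.seed(4)
dat <- gamSim(1, n = 400, dist = "normal", scale = 2, verbose = FALSE)
b <- bam(y ~ s(x0) + s(x1) + s(x2), data = dat)
nd <- dat[1:5, ]

p_link  <- predict(b, newdata = nd)                    # 1 dimensional
p_se    <- predict(b, newdata = nd, se.fit = TRUE)     # list: fit, se.fit
p_terms <- predict(b, newdata = nd, type = "terms")    # one column per term
Xp      <- predict(b, newdata = nd, type = "lpmatrix") # rows by coefficients

names(p_se)
dim(p_terms)
dim(Xp)

## termwise predictions exclude the intercept, so adding it back
## to the row sums recovers the linear predictor for this model
rebuilt <- rowSums(p_terms) + coef(b)[1]
all.equal(as.numeric(rebuilt), as.numeric(p_link))
```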
WARNING
Predictions are likely to be incorrect if data dependent transformations of the covariates
are used within calls to smooths. See examples in predict.gam.
Author(s)
Simon N. Wood simon.wood@r-project.org
The design is inspired by the S function of the same name described in Chambers and Hastie (1993) (but is not a clone).
References
Chambers and Hastie (1993) Statistical Models in S. Chapman & Hall.
Marra, G and S.N. Wood (2012) Coverage Properties of Confidence Intervals for Generalized Additive Model Components. Scandinavian Journal of Statistics.
Wood S.N. (2006b) Generalized Additive Models: An Introduction with R. Chapman and Hall/CRC Press.
See Also
Examples
## for parallel computing see examples for ?bam
## for general usage follow examples in ?predict.gam