plot.lm {stats}  R Documentation 
Plot Diagnostics for an lm Object
Description
Six plots (selectable by which) are currently available: a plot
of residuals against fitted values, a Q-Q plot of residuals, a
Scale-Location plot of \sqrt{|residuals|} against fitted values, a
plot of Cook's distances versus row labels, a plot of residuals
against leverages, and a plot of Cook's distances against
leverage/(1-leverage), numbered 1 to 6 in that order. By default,
plots 1, 2, 3 and 5 are provided.
Usage
## S3 method for class 'lm'
plot(x, which = c(1,2,3,5),
     caption = list("Residuals vs Fitted", "Q-Q Residuals",
                    "Scale-Location", "Cook's distance",
                    "Residuals vs Leverage",
                    expression("Cook's dist vs Leverage* " * h[ii] / (1 - h[ii]))),
     panel = if(add.smooth) function(x, y, ...)
               panel.smooth(x, y, iter=iter.smooth, ...) else points,
     sub.caption = NULL, main = "",
     ask = prod(par("mfcol")) < length(which) && dev.interactive(),
     ...,
     id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75,
     qqline = TRUE, cook.levels = c(0.5, 1.0),
     cook.col = 8, cook.lty = 2, cook.legendChanges = list(),
     add.smooth = getOption("add.smooth"),
     iter.smooth = if(isGlm) 0 else 3,
     label.pos = c(4,2),
     cex.caption = 1, cex.oma.main = 1.25,
     extend.ylim.f = 0.08)
Arguments
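x
an lm object, typically the result of lm or glm.
which
a subset of the numbers 1:6, by default c(1, 2, 3, 5), referring to the plots listed in ‘Description’.
See also ‘Details’ below.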
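caption
captions to appear above the plots; a character vector or list of
valid graphics annotations (see as.graphicsAnnot), one for each of the six plots as in the default shown in ‘Usage’.
panel
panel function. The useful alternative to points, panel.smooth,
can be chosen by add.smooth = TRUE.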
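sub.caption
common title, placed above the figures if there are more
than one; used as sub (see title) otherwise. If NULL, as by default, a possibly abbreviated version of the call that created x is used.
main
title to each plot, in addition to caption.
ask
logical; if TRUE, the user is asked before each plot, see par(ask = .).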
... 
other parameters to be passed through to plotting functions. 
id.n 
number of points to be labelled in each plot, starting with the most extreme. 
labels.id 
vector of labels, from which the labels for extreme
points will be chosen. 
cex.id 
magnification of point labels. 
qqline
logical indicating if a qqline() should be added to the normal Q-Q plot.
cook.levels 
levels of Cook's distance at which to draw contours. 
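cook.col, cook.lty
color and line type to use for these contour lines.
cook.legendChanges
a list (or NULL to suppress the legend) of arguments to legend that modify or add to the defaults used for the Cook's distance legend in plot 5.
add.smooth
logical indicating if a smoother should be added to
most plots; see also panel above.
iter.smooth
the number of robustness iterations, passed as the argument
iter to panel.smooth(); by default no such iterations are used for glm fits.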
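label.pos
positioning of labels, for the left half and right half of the graph respectively, for plots 1–3, 5, 6.
cex.caption
controls the size of caption.
cex.oma.main
controls the size of the sub.caption when it is placed above the figures, i.e. when there is more than one figure per page.
extend.ylim.f
a numeric vector of length 1 or 2, to be used in
ylim <- extendrange(r = ylim, f = *) for plots 1 and 5 when id.n is non-empty.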
Details
sub.caption (by default the function call) is shown as
a subtitle (under the x-axis title) on each plot when plots are on
separate pages, or as a subtitle in the outer margin (if any) when
there are multiple plots per page.
The ‘Scale-Location’ plot (which=3), also called ‘Spread-Location’ or
‘S-L’ plot, takes the square root of the absolute residuals in
order to diminish skewness (\sqrt{|E|} is much less skewed
than |E| for Gaussian zero-mean E).
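The skewness claim can be checked by simulation. The following is a minimal sketch and not part of plot.lm: the helper skew(), the seed and the sample size are illustrative assumptions.

```r
## Sketch: compare the sample skewness of |E| and sqrt(|E|)
## for simulated Gaussian zero-mean E.
## skew() is a hypothetical helper, not a function from stats.
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5      # third standardized sample moment
}

set.seed(1)                      # arbitrary seed, for reproducibility
E <- rnorm(1e5)                  # Gaussian, zero mean

skew_abs  <- skew(abs(E))        # skewness of |E|
skew_sqrt <- skew(sqrt(abs(E)))  # skewness of sqrt(|E|)
c(abs = skew_abs, sqrt = skew_sqrt)
```

The second value should be much closer to zero than the first, which is why the square-root scale is used on the vertical axis of this plot.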
The ‘S-L’, the Q-Q, and the Residual-Leverage (which=5) plot use
standardized residuals which have identical variance (under the
hypothesis). They are given as
R_i / (s \times \sqrt{1 - h_{ii}})
where the ‘leverages’ h_{ii} are the diagonal entries
of the hat matrix, influence()$hat (see also hat), and
where the Residual-Leverage plot uses the standardized Pearson residuals
(residuals.glm(type = "pearson")) for R_i.
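For an unweighted lm fit, the formula above can be evaluated by hand and compared with rstandard(). This is a minimal sketch; the object names fit, h, s and r_std are illustrative, and the model is the one used in ‘Examples’.

```r
## Sketch: standardized residuals R_i / (s * sqrt(1 - h_ii)) by hand
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

h <- hatvalues(fit)              # leverages h_ii (diagonal of the hat matrix)
s <- summary(fit)$sigma          # residual standard error
r_std <- residuals(fit) / (s * sqrt(1 - h))

## Same leverages as influence()$hat, same residuals as rstandard()
all.equal(h, influence(fit)$hat)
all.equal(r_std, rstandard(fit))

## Vertical axis of the Scale-Location plot
sl_y <- sqrt(abs(r_std))
```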
The Residual-Leverage plot (which=5) shows contours of equal Cook's distance,
for values of cook.levels (by default 0.5 and 1) and omits
cases with leverage one with a warning. If the leverages are constant
(as is typically the case in a balanced aov situation)
the plot uses factor level combinations instead of the leverages for
the x-axis. (The factor levels are ordered by mean fitted value.)
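To see where such contours come from, Cook's distance for an lm fit can be written in terms of the standardized residual r_i and the leverage h_ii as D_i = r_i^2 h_ii / (p (1 - h_ii)), with p the number of estimated coefficients. The sketch below checks this identity and solves it for |r| at one contour level; the variable names are illustrative, and the code is not claimed to reproduce the internals of plot.lm.

```r
## Sketch: Cook's distance from standardized residuals and leverages,
## and the curve |r| = sqrt(level * p * (1 - h) / h) of constant distance.
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

h <- hatvalues(fit)              # leverages
r <- rstandard(fit)              # standardized residuals
p <- fit$rank                    # number of estimated coefficients

D <- r^2 * h / (p * (1 - h))     # Cook's distance by hand
all.equal(D, cooks.distance(fit))

## Contour for Cook's distance 0.5 over the observed leverage range;
## the plot would show +r_contour and -r_contour.
level <- 0.5
hh <- seq(min(h), max(h), length.out = 101)
r_contour <- sqrt(level * p * (1 - hh) / hh)
```

Points lying outside the band between -r_contour and +r_contour have a Cook's distance above the chosen level.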
In the Cook's distance vs leverage/(1-leverage) (= “leverage*”)
plot (which=6), contours of
standardized residuals (rstandard(.)) that are equal in
magnitude are lines through the origin. These lines are labelled with
the magnitudes. The x-axis is labeled with the (non-equidistant)
leverages h_{ii}.
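The straight-line property follows from writing Cook's distance as D_i = (r_i^2 / p) g_i with g_i = h_ii / (1 - h_ii): for fixed |r_i| the distance is proportional to g_i. A minimal sketch of this check, with illustrative variable names:

```r
## Sketch: in the (leverage*, Cook's distance) plane, each point lies on
## a ray through the origin whose slope is r^2 / p.
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

h <- hatvalues(fit)
r <- rstandard(fit)
p <- fit$rank                    # number of estimated coefficients
D <- cooks.distance(fit)

g <- h / (1 - h)                 # "leverage*", the x-axis of plot 6
slope <- D / g                   # slope of the ray through each point
all.equal(slope, r^2 / p)
```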
For the glm case, the Q-Q plot is based on the absolute value
of the standardized deviance residuals. When the saddlepoint
approximation applies, these have an approximate half-normal
distribution. The saddlepoint approximation is exact for the normal
and inverse Gaussian family, and holds approximately for the Gamma
family with small dispersion (large shape) and for the Poisson and
binomial families with large counts (Dunn and Smyth 2018).
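A half-normal Q-Q plot of this kind can be sketched by hand as follows. This is an illustration under stated assumptions, not the code used inside plot.lm: the data are the small Poisson example from the glm help page, and the plotting positions (ppoints) and the way the half-normal quantiles are formed are choices made here.

```r
## Sketch: half-normal Q-Q plot of |standardized deviance residuals|
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)
treatment <- gl(3, 3)
fit_glm <- glm(counts ~ outcome + treatment, family = poisson())

abs_rsd <- abs(rstandard(fit_glm, type = "deviance"))
n <- length(abs_rsd)

## If Z ~ N(0, 1) then P(|Z| <= q) = 2 * pnorm(q) - 1,
## so the half-normal quantile at probability u is qnorm((1 + u) / 2).
q_half <- qnorm((1 + ppoints(n)) / 2)

plot(q_half, sort(abs_rsd),
     xlab = "Half-normal quantiles",
     ylab = "|Std. deviance resid.|")
abline(0, 1, lty = 2)            # reference line y = x
```

With only nine observations and moderate counts the half-normal reference is approximate at best, so the plot should be read loosely.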
Author(s)
John Maindonald and Martin Maechler.
References
Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics. New York: Wiley.
Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. London: Chapman and Hall.
Firth, D. (1991). Generalized Linear Models. Pp. 55–82 in Hinkley, D. V., Reid, N. and Snell, E. J., eds: Statistical Theory and Modelling. In Honour of Sir David Cox, FRS. London: Chapman and Hall.
Hinkley, D. V. (1975). On power transformations to symmetry. Biometrika, 62, 101–111. doi:10.2307/2334491.
McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models. London: Chapman and Hall.
Dunn, P. K. and Smyth, G. K. (2018). Generalized Linear Models with Examples in R. New York: Springer-Verlag.
See Also
termplot, lm.influence, cooks.distance, hatvalues.
Examples
require(graphics)
## Analysis of the life-cycle savings data
## given in Belsley, Kuh and Welsch.
lm.SR <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
plot(lm.SR)
## 4 plots on 1 page;
## allow room for printing model formula in outer margin:
par(mfrow = c(2, 2), oma = c(0, 0, 2, 0)) -> opar
plot(lm.SR)
plot(lm.SR, id.n = NULL) # no id's
plot(lm.SR, id.n = 5, labels.id = NULL) # 5 id numbers
## Was default in R <= 2.1.x:
## Cook's distances instead of Residual-Leverage plot
plot(lm.SR, which = 1:4)
## All the above fit a smooth curve where applicable
## by default unless "add.smooth" is changed.
## Give a smoother curve by increasing the lowess span :
plot(lm.SR, panel = function(x, y) panel.smooth(x, y, span = 1))
par(mfrow = c(2,1)) # same oma as above
plot(lm.SR, which = 1:2, sub.caption = "Saving Rates, n=50, p=5")
## Cook's distance tweaking
par(mfrow = c(2,3)) # same oma ...
plot(lm.SR, which = 1:6, cook.col = "royalblue")
## A case where over plotting of the "legend" is to be avoided:
if(dev.interactive(TRUE)) getOption("device")(height = 6, width = 4)
par(mfrow = c(3,1), mar = c(5,5,4,2)/2 +.1, mgp = c(1.4, .5, 0))
plot(lm.SR, which = 5, extend.ylim.f = c(0.2, 0.08))
plot(lm.SR, which = 5, cook.lty = "dotdash",
     cook.legendChanges = list(x = "bottomright", legend = "Cook"))
plot(lm.SR, which = 5, cook.legendChanges = NULL) # no "legend"
par(opar) # reset par()s