Plotting with the distfreereg Package

Introduction

Plotting methods for distfreereg and compare objects are intended to provide sensible defaults, almost all of which can be modified easily as needed.

While the examples below are separated into distfreereg and compare classes, the way modifications are made in each case is similar.

Plotting distfreereg Objects

The following code creates the distfreereg object with a fairly simple setup that is used in most of the examples that follow.

set.seed(20240214)
n <- 1e2
true_mean <- function(X, theta) theta[1] + theta[2]*X[,1]
theta <- c(2,5)

X <- matrix(runif(n, min = 1, max = 100))
Y <- true_mean(X, theta) + rnorm(n)

dfr <- distfreereg(Y = Y, X = X, test_mean = true_mean,
                   covariance = list(Sigma = 1),
                   theta_init = rep(1, length(theta)))

Density Plots

The default plot displays the estimated density of the simulated statistics:

plot(dfr)

The curve is the estimated density of the simulated statistics, and the vertical line marks the observed statistic. The upper tail is shaded, and the p-value (the area of the shaded region) is shown. A 95% simultaneous confidence band for the estimated density function is also plotted.
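
To make the pieces of this plot concrete, the following base R sketch assembles a similar figure by hand. It is not the package's implementation: sim_stats and obs_stat are hypothetical stand-ins for the simulated and observed statistics, the exponential draws are placeholder data, and the confidence band is omitted.

```r
set.seed(1)
sim_stats <- rexp(1e4)   # placeholder for simulated statistics (assumption)
obs_stat  <- 2           # placeholder for the observed statistic (assumption)

# Monte Carlo p-value: share of simulated statistics at least as large
p_value <- mean(sim_stats >= obs_stat)

# Kernel density estimate of the simulated statistics
d <- density(sim_stats)
tail_idx <- d$x >= obs_stat
xs <- d$x[tail_idx]
ys <- d$y[tail_idx]

# Trapezoidal area of the upper tail; it should be close to p_value
tail_area <- sum(diff(xs) * (ys[-1] + ys[-length(ys)]) / 2)
c(p_value = p_value, tail_area = tail_area)

plot(d, main = "Simulated statistic (sketch)")
polygon(c(xs[1], xs, xs[length(xs)]), c(0, ys, 0),
        col = "grey80", border = NA)           # shaded upper tail
abline(v = obs_stat)                           # observed statistic
text(obs_stat, max(d$y) / 2, adj = c(1, 0.5),  # label to the left of the line
     labels = paste("p =", round(p_value, 3)))
```

The two printed numbers should nearly agree, which is the sense in which the p-value is the area of the shaded region.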

Modifying the P-Value Label

The plot above is not ideal because the p-value label overlaps the density curve. The specifications for this label can be modified using the text_args argument, the elements of which are passed to graphics::text().

The default placement is vertically centered on the left side of the vertical line. By using the text_args argument, the text can be printed on the right side of the line and shifted down a bit. Note that only the elements to change need to be specified in the list supplied to text_args.

plot(dfr, text_args = list(adj = c(0, 0.5), y = 0.5))

One special value, text_args = FALSE, prevents the label from being printed.

plot(dfr, text_args = FALSE)
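
One plausible way to get this behavior, in which only the changed elements need to be supplied and FALSE suppresses the label, is to merge the user's list into a list of defaults. The sketch below uses utils::modifyList() for that. The helper resolve_text_args() and the default values are hypothetical, and the package may do this differently.

```r
# Hypothetical helper: merge user-supplied text() arguments into defaults
resolve_text_args <- function(text_args = list(), defaults) {
  if (isFALSE(text_args)) return(NULL)     # FALSE suppresses the label
  utils::modifyList(defaults, text_args)   # user values override defaults
}

# Illustrative defaults only (assumption)
defaults <- list(x = 2, y = 0.3, labels = "p = 0.135", adj = c(1, 0.5))

args <- resolve_text_args(list(adj = c(0, 0.5), y = 0.5), defaults)
str(args)

plot(0:4, seq(0, 1, length.out = 5), type = "n", xlab = "", ylab = "")
abline(v = defaults$x)
if (!is.null(args)) do.call(text, args)    # label to the right of the line
```

Here adj = c(0, 0.5) left-justifies the text at x, so it appears to the right of the line, while the x and labels elements keep their default values.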

Selecting the Statistic to Plot

The stat argument selects the statistic whose density is plotted. For example, the following call produces the corresponding plot for the CvM statistic.

plot(dfr, stat = "CvM", text_args = FALSE)

Modifying the Density Calculation and Plotting

The calculation of the density curve can be modified using density_args, whose elements are passed to density().
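
As a rough illustration of what such arguments do, the base R sketch below passes a list of arguments to density() with do.call(). The data are placeholders, and the list is called density_args only to mirror the argument described above.

```r
set.seed(1)
sim_stats <- rexp(1e3)               # placeholder data (assumption)

density_args <- list(adjust = 2)     # for example, double the default bandwidth
d_default <- density(sim_stats)
d_smooth  <- do.call(density, c(list(x = sim_stats), density_args))

# Bandwidths actually used by the two estimates
c(default = d_default$bw, adjusted = d_smooth$bw)

plot(d_default, main = "Effect of adjust (sketch)")
lines(d_smooth, lty = 2)             # smoother curve from the larger bandwidth
```

Other arguments of density(), such as bw, kernel, or n, can be supplied in the same list.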

To modify the appearance of the curve once the density curve has been estimated, pass arguments to plot.default() via the ... argument. The example below shows how to modify the line type.

plot(dfr, text_args = FALSE, lty = 2)

Modifying the P-Value Line and Shading

The appearance of the vertical line can be modified using the abline_args argument, whose elements are passed to abline(). As with the text label, the line can be omitted by setting abline_args equal to FALSE.

plot(dfr, text_args = FALSE, abline_args = FALSE)

The shading under the curve is produced by calling polygon(), and can be modified by passing arguments to that function via polygon_args. It can also be omitted by setting polygon_args = FALSE.

plot(dfr, text_args = FALSE, polygon_args = list(density = 10))

As a convenience, the shading color can be changed using the shade_col argument. This is equivalent to modifying the col argument of polygon().

plot(dfr, text_args = FALSE, shade_col = rgb(0.5, 0.5, 0.8, 0.5))
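
The fourth argument of rgb() is the alpha (opacity) value, so the color above is semi-transparent. The short base R sketch below, with an arbitrary rectangle standing in for the shaded region, shows the resulting color string and how content underneath remains visible.

```r
shade <- rgb(0.5, 0.5, 0.8, 0.5)   # red, green, blue, alpha, each in [0, 1]
shade                               # hexadecimal "#RRGGBBAA" string
col2rgb(shade, alpha = TRUE)        # components on the 0-255 scale

plot(1, type = "n", xlim = c(0, 3), ylim = c(0, 1), xlab = "", ylab = "")
abline(h = 0.5, lwd = 3)            # stays visible through the shading
polygon(c(1, 2, 2, 1), c(0, 0, 1, 1), col = shade, border = NA)
```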

Borders

The default behavior is to omit the border of the shaded region. As seen above, this is noticeable when the vertical line is omitted. To draw a border, set border = NULL in polygon_args; this is the default value of border in polygon(), and it (usually) results in a border.

plot(dfr, text_args = FALSE, abline_args = FALSE, polygon_args = list(border = NULL))

Output

In case the values used to create a plot are useful for further calculation, each call to plot.distfreereg() invisibly returns these values in a list with either two or three elements. The first two elements, x and y, contain the coordinates of the curve. If confidence bands are plotted, then a third element named confband is also returned.

output <- plot(dfr, confband_args = NULL, text_args = FALSE)

names(output)
## [1] "x"        "y"        "confband"

The \(y\)-values of the curves that determine the confidence band are saved in the cb_lower and cb_upper elements, while the \(x\)-values are saved in w.

names(output$confband)
## [1] "call"      "Sigma_hat" "radii"     "w"         "fnw"       "cb_lower" 
## [7] "cb_upper"
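
As a sketch of the kind of further calculation this allows, the code below works with a list that has x and y elements. The list here is a stand-in built from density() on placeholder data, not actual output from plot.distfreereg().

```r
set.seed(1)
d <- density(rexp(1e3))                 # placeholder curve (assumption)
output <- list(x = d$x, y = d$y)        # mimics only the x and y elements

# Trapezoidal rule: the total area under a density curve should be close to 1
area <- sum(diff(output$x) *
            (output$y[-1] + output$y[-length(output$y)]) / 2)

# Location of the curve's peak
mode_x <- output$x[which.max(output$y)]
c(area = area, mode = mode_x)
```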

Diagnostic Plots

Below are examples of two diagnostic plots available through plot.distfreereg().

Residual Plots

A useful diagnostic plot displays the transformed residuals ordered according to the res_order element of the distfreereg object.

plot(dfr, which = "residuals")

As with the density plot, all options can be modified if needed by including additional arguments for plot().

plot(dfr, which = "residuals", main = "New Title", lty = "dashed")

Empirical Partial Sum Process Plots

Another useful diagnostic plot displays the values of the empirical partial sum process. As with the residual plot, its appearance can be modified by including additional arguments for plot().

plot(dfr, which = "epsp")

plot(dfr, which = "epsp", xlab = "i", col = "red")
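
For intuition, the following base R sketch computes a partial sum process from placeholder residuals. It assumes, without confirmation from the text above, that the process is the cumulative sum of the ordered transformed residuals scaled by the square root of the sample size, and that the KS and CvM statistics are its maximum absolute value and its mean square. The names res and res_order are stand-ins.

```r
set.seed(1)
n <- 100
res <- rnorm(n)               # placeholder for transformed residuals (assumption)
res_order <- order(runif(n))  # placeholder ordering (assumption)

# Scaled cumulative sums of the ordered residuals
epsp <- cumsum(res[res_order]) / sqrt(n)

ks  <- max(abs(epsp))         # Kolmogorov-Smirnov-type statistic
cvm <- mean(epsp^2)           # Cramer-von Mises-type statistic
c(KS = ks, CvM = cvm)

plot(epsp, type = "l", xlab = "i", ylab = "EPSP", col = "red")
abline(h = 0, lty = 2)        # reference line at zero
```

Under these assumptions, a path that drifts steadily away from zero produces large values of both statistics.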

Plotting compare Objects

Most of the plot modifications shown above apply to compare objects as well. They are nevertheless illustrated explicitly below.

Setup

The following code creates the compare object with a fairly simple setup that is used in all of the examples that follow.

set.seed(20240920)
n <- 100
func <- function(X, theta) theta[1] + theta[2]*X[,1] + theta[3]*X[,2]
theta <- c(2,5,-1)
X <- matrix(rexp(2*n), nrow = n)
cdfr <- compare(theta = theta, true_mean = func, test_mean = func,
                true_X = X, true_covariance = list(Sigma = 3), X = X,
                covariance = list(Sigma = 3), prog = Inf,
                theta_init = rep(1, length(theta)))

CDF Plots

By default, plot.compare() displays the graphs of the estimated cumulative distribution functions of the observed and simulated statistics.

plot(cdfr)
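
A comparable figure can be sketched in base R with ecdf(). The vectors observed and simulated below are placeholder data standing in for the two sets of statistics, and the code is only an illustration, not the method used by plot.compare().

```r
set.seed(1)
observed  <- rexp(200)     # placeholder observed statistics (assumption)
simulated <- rexp(1e4)     # placeholder simulated statistics (assumption)

F_obs <- ecdf(observed)    # step functions returned by ecdf()
F_sim <- ecdf(simulated)

grid <- seq(0, max(observed, simulated), length.out = 200)
plot(grid, F_sim(grid), type = "s", xlab = "Statistic", ylab = "ECDF")
lines(grid, F_obs(grid), type = "s", lty = 2)
abline(h = c(0, 1), lty = 2, col = "grey")   # reference lines at 0 and 1
legend("bottomright", legend = c("simulated", "observed"), lty = c(1, 2))

# Largest vertical gap between the two curves on the grid
max_gap <- max(abs(F_obs(grid) - F_sim(grid)))
max_gap
```

When the two samples come from the same distribution, as here, the curves should nearly coincide and the gap should be small.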

Curves

The appearance of the function curves can be modified using the curve_args argument. Note that this passes values to lines(), not curve(). The value of curve_args must be a list. If an argument of lines() is an element of this list, then its value is passed to the calls for both curves. For example, the width of both curves can be changed as follows.

plot(cdfr, curve_args = list(lwd = 3))

To change a property of only one curve, two special (named) elements of this list are available: “obs” and “mcsim”. Each of these, if present, must be a list. Their elements are passed to the lines() call of the corresponding curve. The following example shows how to change the thickness of both curves but the style of only the observed statistics curve.

plot(cdfr, curve_args = list(lwd = 3, obs = list(lty = 4)))
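
The combination of shared and per-curve settings can be pictured as follows. The helper split_curve_args() is hypothetical and only illustrates one way such a list could be resolved into one argument list per curve; the package's internals may differ.

```r
# Hypothetical helper: resolve shared and per-curve arguments for lines()
split_curve_args <- function(curve_args) {
  special <- c("obs", "mcsim")
  shared <- curve_args[setdiff(names(curve_args), special)]
  lapply(setNames(special, special), function(nm) {
    own <- curve_args[[nm]]
    # Per-curve values are added to, or override, the shared ones
    if (is.null(own)) shared else utils::modifyList(shared, own)
  })
}

resolved <- split_curve_args(list(lwd = 3, obs = list(lty = 4)))
str(resolved)

x <- seq(0, 1, length.out = 50)
plot(x, x, type = "n", ylab = "y")
do.call(lines, c(list(x = x, y = x^2), resolved$obs))        # lwd = 3, lty = 4
do.call(lines, c(list(x = x, y = sqrt(x)), resolved$mcsim))  # lwd = 3 only
```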

Legend

The argument legend can be used to modify the default behavior of the legend.

plot(cdfr, legend = list(title = "A Title", bg = "grey"))

While not recommended, the legend can be omitted by setting legend = FALSE.

plot(cdfr, legend = FALSE)

Horizontal Lines

The horizontal dashed lines are plotted by default to mimic the default behavior of plot.ecdf(). These can be modified using the hlines argument, whose value is a list of arguments to pass to abline().

plot(cdfr, hlines = list(lty = 1, lwd = 3))

Confidence Bands

Confidence bands are not plotted by default, but they can be included with default values as follows.

plot(cdfr, confband_args = NULL)

Density Plots

Estimated density curves can be plotted using the which argument.

plot(cdfr, which = "dens")

The area under each density curve is shaded by default. The shading can be modified using the poly argument, either for both curves together or for each curve separately, in the same way as with curve_args. The example below shows how to change the density of the shading for both curves but the color and angle of only the observed statistics curve.

plot(cdfr, which = "dens",
     poly = list(density = 20, obs = list(col = rgb(0.5,0.2,0.2,0.2), angle = -45)))

Other options, such as curve_args, operate here as described above in the discussion of CDF plotting.

Q–Q Plots

The other plots available are Q–Q plots: one compares observed and simulated statistics, and the other compares p-values to uniform quantiles.

plot(cdfr, which = "qq")

plot(cdfr, which = "qqp")
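
Both kinds of plot can be sketched with qqplot() from base R. In the code below, observed and simulated are placeholder data, and the p-values are computed as simple Monte Carlo upper-tail proportions, which is an assumption made for illustration.

```r
set.seed(1)
observed  <- rexp(200)    # placeholder observed statistics (assumption)
simulated <- rexp(1e4)    # placeholder simulated statistics (assumption)

# Q-Q plot of observed against simulated statistics
qq <- qqplot(simulated, observed, xlab = "Simulated", ylab = "Observed")
abline(0, 1)              # points near this line suggest matching distributions

# Monte Carlo p-value for each observed statistic
p_values <- vapply(observed, function(s) mean(simulated >= s), numeric(1))

# Q-Q plot of the p-values against uniform quantiles
qqp <- qqplot(qunif(ppoints(length(p_values))), p_values,
              xlab = "Uniform quantiles", ylab = "p-values")
abline(0, 1)
```

Each call to qqplot() returns a list with the sorted x and y values it plotted, which is what makes the returned values available for later use.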

Both of these plots accept optional arguments, which are passed to qqplot():

plot(cdfr, which = "qq", conf.level = 0.95)

The diagonal line can be modified using the qqline argument, whose elements are passed to abline().

plot(cdfr, which = "qqp", qqline = list(lwd = 3))

Multiple compare Objects

It might be useful to compare the observed statistics in two compare objects. This can be done by supplying both objects to plot(). To illustrate, we first create a second compare object:

set.seed(20240920)
n <- 100
func <- function(X, theta) theta[1] + theta[2]*X[,1]
theta <- c(7,3)
X <- matrix(rexp(n), nrow = n)
cdfr2 <- compare(theta = theta, true_mean = func, test_mean = func,
                 true_X = X, true_covariance = list(Sigma = 3), X = X,
                 covariance = list(Sigma = 3), prog = Inf,
                 theta_init = rep(1, length(theta)))

The following call compares the observed statistics from the two compare objects.

plot(cdfr, cdfr2)

Output

In case the values used to create a plot are useful for further calculation, each call to plot.compare() invisibly returns these values in a list. The Q–Q plots both return the value returned by qqplot() itself. The other plots return values corresponding to their curves, including confidence bands, if plotted.

output <- plot(cdfr, confband_args = NULL)

names(output)
## [1] "observed"           "simulated"          "confband_observed" 
## [4] "confband_simulated"

The \(y\)-values of the curves that determine the confidence band are saved in the cb_lower and cb_upper elements, while the \(x\)-values are saved in w.

names(output$confband_observed)
## [1] "call"      "Sigma_hat" "radii"     "w"         "fnw"       "cb_lower" 
## [7] "cb_upper"