Confidence intervals

The negative log-likelihood (nll) functions in this package allow various survival models to be fit to observed survival data by maximum likelihood estimation. The actual variables estimated are the random variables defining the location and scale parameters of the survival models.

Both approaches are used by the function anovir::conf_ints_virulence to estimate the confidence intervals of a hazard function describing a pathogen's virulence. This function, however, is limited to estimates arising from the default form of anovir::nll_basic where the location and scale parameters for background mortality and mortality due to infection are described by the variables, a1,b1 and a2,b2, respectively.

Three worked examples are provided where the number of random variables contributing to the parameters of the hazard function are 1, 2 or 3.

The second example estimates the 95% confidence intervals of a hazard function describing a pathogen's virulence. The two variables estimated are those defining the location and scale parameters for host mortality due to infection. This type of analysis is compatible with the function anovir::conf_ints_virulence and illustrates the sequence of steps taken by the function.

Finally, a worked example also shows how to calculate 95% confidence intervals on the natural scale for the results when they are calculated with nll_basic_logscale

The delta method

The delta method provides a means to estimate the approximate variance of a function when the function is a function of one or more random variables, and where there is an estimate for the variance of each random variable. For example if g is a function of f, where f is a function of the random variables, \(X_1,X_2,\dots,X_n\), such that,

\[ g = f(X_1,X_2,\dots,X_n)\] and estimates for the variance of each \(X_i\) are available, then by the delta method the first order approximation for the variance of g is,

\[\begin{align} \textrm{Var}\left(g\right) &\approx \textrm{Var}\left[f\left(X_1, X_2,\dots, X_n \right) \right] \\ \\ &= \sum_{i=1}^{n} \left( \frac{\partial f}{\partial X_i} \right)^2 \textrm{Var}\left( X_i \right) + 2 \sum_{i=1}^n \sum_{j=1}^{n} \left(\frac{\partial f}{\partial X_i}\right) \left(\frac{\partial f}{\partial X_j}\right) \textrm{Cov}\left(X_i, X_j \right) \end{align}\]

where \(\partial f / \partial X_i\) is the expression for the first partial derivative of \(f(\cdot)\) with respect to \(X_i\), while \(\textrm{Var}(X_i)\) and \(\textrm{Cov}(X_i,X_j)\) are terms for the variance and covariances of the random variables, respectively.

This first order approximation is equivalent to the function being described by a tangent line intersecting it at its mean. Higher order approximations are equivalent to describing the function around its mean with polynomial expressions to a higher degree. Such expressions are progressively more flexible and provide a better description of the function. However the added value of approximations of order two or above rapidly goes towards zero if the function is approximately linear about its mean. In such cases there is little benefit to going beyond a first order approximation.

Setting confidence interval bounds

The approximate nature of a first order approximation for the variance of a hazard function means it can yield lower estimates that go below zero. This situation can be remedied by a back-calculation of the confidence intervals estimated on a logscale.

Let L(t) be the log transformation of the hazard function h(t),

\[ L(t) = \log \left[h(t)\right] \] Define the lower and upper bounds for \(L\left(t\right)\), based on its estimator, \(\hat{L}\left(t\right)\),

\[\left[\hat{L}(t)-A,\; \hat{L}(t)+A \right] \] where A is the distance from the mean to the lower or upper bounds of the confidence interval. The back-transformation from the log-scale gives the lower and upperbounds as, \[\left[\exp\left(\hat{L}[t]-A\right),\; \exp\left( \hat{L}[t]+A \right) \right] \] this equals,

\[\left[ \exp\left(\hat{L}\left[t\right]\right) \cdot \exp \left( -A \right),\; \exp\left( \hat{L}\left[t\right] \right) \cdot \exp\left( A \right) \right]\]

The back-transformation of \(\hat{L}\left(t\right)\) is, \[ \hat{h}\left(t\right) = \exp \left(\hat{L}\left[t\right] \right)\]

which can be substituted into the bounds above to give,

\[\left[ \hat{h}\left(t\right) \cdot \exp \left(-A\right),\; \hat{h}\left(t\right) \cdot \exp\left( A\right) \right]\] For a 95% confidence interval,

\[\begin{align} A &= 1.96 \cdot \sqrt{\textrm{Var}\left[\hat{L}\left(t\right)\right]} \\ &= 1.96 \cdot \sqrt{\textrm{Var}\left[\log\left(\hat{h}\left[t\right]\right)\right]} \end{align}\]

Use the delta method to estimate \(\textrm{Var}\left[\log\left(\hat{h}\left[t\right]\right)\right]\) as a function of \(\hat{h}\left(t\right)\);

\[\begin{align} X &= \hat{h}\left(t\right) \\ Y &= \log \left(X\right) \\ \textrm{Var}\left(Y\right) &\approx \textrm{Var}\left(\log\left[X\right]\right) \\ \textrm{Var}\left(\log \left[X\right]\right)& = \left( \frac{\partial \log\left[X\right]}{\partial X}\right)^2 \cdot \textrm{Var}\left(X\right) \\ \textrm{Var}\left(\log \left[\hat{h}\left(t\right)\right]\right)&= \left( \frac{1}{X}\right)^2 \cdot \textrm{Var}\left(X\right) \\ & = \left( \frac{1}{\hat{h}\left(t\right)}\right)^2 \cdot \textrm{Var}\left(\hat{h}\left[t\right]\right) \end{align}\]

Consequently the A term above becomes;

\[\begin{align} A &= 1.96 \cdot \sqrt{\left( \frac{1}{\hat{h}\left(t\right)} \right)^2 \cdot \textrm{Var}\left(\hat{h}\left[t\right]\right)} \\ &= \frac{1.96 \cdot \sqrt{\textrm{Var}\left[\hat{h}\left(t\right)\right]}}{\hat{h}\left(t\right)} \end{align}\]

and the 95% confidence intervals,

\[\hat{h}\left(t\right) \cdot \exp \left( \frac{\pm 1.96 \cdot\sqrt{\textrm{Var}\left[\hat{h}\left(t\right)\right]}}{\hat{h}\left(t\right)} \right) \]

Worked example: 1 random variable

This example illustrates the estimation of the 95% confidence intervals of a hazard function based on a single random variable.

Sy et al. [1] recorded the survival of caged adult female Aedes aegypti mosquitoes over a three week period. The functions anovir::nll_basic and bbmle::mle2 were used to estimate patterns of mosquito mortality based on survival functions for the Weibull distribution.

The variables estimated were a1, b1, a2, b2 which defined the location and scale parameters for background mortality and mortality due to infection, respectively.

An initial analysis suggested the scale parameter for background mortality (b1) was close to one, such that, the background rate of mortality could be described by the exponential distribution. In this case the hazard function for background mortality at time t, \(\textit{h1}\left(t\right)\) is,

\[\textit{h1} \left(t\right) = \frac{1}{\exp \left(\textit{a1}\right)}\] where a1 is the location parameter and a random variable to be estimated.

In this case the approximate variance of the hazard function by the delta method is,

\[\begin{align} \textrm{Var}\left[\textit{h1}\left(t\right)\right] &\approx \left[ \frac{\partial \textit{h1}\left(t\right)}{\partial \textit{a1}}\right]^2 \textrm{Var}\left(\textit{a1}\right) \\ \\ &= \left[ \left( \frac{-1}{\exp \left[ \mu_\textit{a1} \right]} \right)^2 \right] \sigma^2_\textit{a1} \end{align}\]

where \(\mu_{\textit{a1}}\) and \(\sigma^2_{\textit{a1}}\) are the mean and variance of a1, respectively.

A revised model m02 set, b1 = 1.0, and gave the following estimates for parameters; a1, a2 and b2

  coef(m02)
#>       a1       b1       a2       b2 
#> 4.813000 1.000000 3.680324 0.636581

and the variance-covariance matrix

  vcov(m02)
#>              a1           a2          b2
#> a1  0.014285300 -0.002035729 0.002179394
#> a2 -0.002035729  0.012434151 0.006217163
#> b2  0.002179394  0.006217163 0.007388838

The lower and upper bounds of the 95% confidence intervals for the hazard function can be estimated as follows;

based on the delta method

  mu_a1 <- coef(m02)[[1]] ; mu_a1
#> [1] 4.813
  var_a1 <- vcov(m02)[1,1] ; var_a1
#> [1] 0.0142853

  h1t <- 1/(exp(mu_a1)) ; h1t
#> [1] 0.008123456
  var_h1t <- (-1/exp(mu_a1))^2 * var_a1 ; var_h1t
#> [1] 9.426946e-07

  lower_ci <- h1t - 1.96*sqrt(var_h1t)
  upper_ci <- h1t + 1.96*sqrt(var_h1t)
  lower_ci ; h1t ; upper_ci
#> [1] 0.006220444
#> [1] 0.008123456
#> [1] 0.01002647

based on delta method and a log transformation of the hazard function

  lower_ci2 <- h1t*exp(-1.96*sqrt(var_h1t)/h1t)
  upper_ci2 <- h1t*exp(+1.96*sqrt(var_h1t)/h1t)

  lower_ci2 ; h1t ; upper_ci2
#> [1] 0.006426912
#> [1] 0.008123456
#> [1] 0.01026784

Worked example: 2 random variables

In this example the confidence intervals for virulence are estimated for a Fréchet hazard function based on three random variables.

An analysis of a subset of the data from Blanford et al [2] found the virulence of three different isolates of the fungal pathogen were best described by hazard functions for the Fréchet distribution. Here the 95% confidence intervals for the virulence of strain Ma07 are estimated.

Part I: estimate parameter variables


library(anovir)

data01 <- data_blanford 

data01 <- subset(data01,
    (data01$block == 5) & (
      (data01$treatment == 'cont') |
      (data01$treatment == 'Ma07') 
    ) &
    (data01$day > 0)
    )

# create column 'g' as index of infected treatment
  data01$g <- data01$inf

  m01_prep_function <- function(a1, b1, a2, b2){
    nll_basic(a1, b1, a2, b2, data = data01, time = t, censor = censor,
              infected_treatment = g, d1 = 'Weibull', d2 = 'Fréchet')
    }

  m01 <- mle2(m01_prep_function,
              start = list(a1 = 2, b1 = 0.5, a2 = 3, b2 = 1)
              )

  summary(m01)
#> Maximum likelihood estimation
#> 
#> Call:
#> mle2(minuslogl = m01_prep_function, start = list(a1 = 2, b1 = 0.5, 
#>     a2 = 3, b2 = 1))
#> 
#> Coefficients:
#>    Estimate Std. Error z value     Pr(z)    
#> a1 3.486901   0.110589  31.530 < 2.2e-16 ***
#> b1 0.715964   0.059226  12.089 < 2.2e-16 ***
#> a2 1.640545   0.024162  67.897 < 2.2e-16 ***
#> b2 0.334838   0.018526  18.074 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 2110.927

Part II: calculate confidence intervals


# view estimates
  coef(m01) ; vcov(m01)
#>        a1        b1        a2        b2 
#> 3.4869005 0.7159644 1.6405450 0.3348380
#>               a1            b1            a2            b2
#> a1  0.0122298917  5.019360e-03 -1.196194e-04 -1.202789e-04
#> b1  0.0050193604  3.507697e-03  8.290656e-05 -6.993859e-05
#> a2 -0.0001196194  8.290656e-05  5.838154e-04  9.126280e-05
#> b2 -0.0001202789 -6.993859e-05  9.126280e-05  3.432267e-04

# assign means, variances & covariances 
# means
  a2 <- m01@coef[[3]]
  b2 <- m01@coef[[4]]
# variances
  var_a2 <- m01@vcov[3,3]
  var_b2 <- m01@vcov[4,4]
# covariances
  cov_a2_b2 <- m01@vcov[3,4]

# specify timespan to calculate
  t <- 1:14

# write an 'expression' for the Fréchet hazard function
  # in terms of 'a2', 'b2' 
  hazard_expression <- expression(
    (1 / (b2 * t)) * exp(-((log(t) - a2) / b2) - exp(-((log(t) - a2) / b2))) / (1 - exp(-exp(-((log(t) - a2) / b2))))
  )

# prepare expression for 'stats::deriv` 
  # NB needs names of variables for partial differentiation 
  dh2_dt <- stats::deriv(hazard_expression, c('a2', 'b2'))

  dh2_dt
#> expression({
#>     .expr1 <- b2 * t
#>     .expr2 <- 1/.expr1
#>     .expr4 <- log(t) - a2
#>     .expr6 <- -(.expr4/b2)
#>     .expr7 <- exp(.expr6)
#>     .expr9 <- exp(.expr6 - .expr7)
#>     .expr10 <- .expr2 * .expr9
#>     .expr12 <- exp(-.expr7)
#>     .expr13 <- 1 - .expr12
#>     .expr15 <- 1/b2
#>     .expr16 <- .expr7 * .expr15
#>     .expr23 <- .expr13^2
#>     .expr27 <- .expr4/b2^2
#>     .expr28 <- .expr7 * .expr27
#>     .value <- .expr10/.expr13
#>     .grad <- array(0, c(length(.value), 2L), list(NULL, c("a2", 
#>         "b2")))
#>     .grad[, "a2"] <- .expr2 * (.expr9 * (.expr15 - .expr16))/.expr13 - 
#>         .expr10 * (.expr12 * .expr16)/.expr23
#>     .grad[, "b2"] <- (.expr2 * (.expr9 * (.expr27 - .expr28)) - 
#>         t/.expr1^2 * .expr9)/.expr13 - .expr10 * (.expr12 * .expr28)/.expr23
#>     attr(.value, "gradient") <- .grad
#>     .value
#> })
  
# evaluate (calculate) hazard function 'h2'
  h2 <- eval(dh2_dt)

# collect estimate of hazard function 'h2' 
# and 'gradients' of partial derivatives
  dh2 <- attr(h2, 'gradient')

# rename columns for clarity
  colnames(dh2) <- c('dh2_da2', 'dh2_db2')

# collect time, estimates of hazard function at time 't'
# and partial derivatives 
  dh2 <- cbind(t, h2, dh2)  

# calculate variance of hazard function
# using delta function  
  var_h2 <- 
    dh2[,'dh2_da2']^2 * var_a2 +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2'] * dh2[,'dh2_db2'] * cov_a2_b2

# calculate standard deviation of estimate
  sd_h2 <- sqrt(var_h2)

# bind to other estimates
  dh2 <- cbind(dh2, sd_h2)

# calculate lower & upper bounds of 95% confidence interval
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

# bind together
  dh2_ma07 <- cbind(dh2, lower_ci, upper_ci)

Part III: compare results with those of `conf_ints_virulence`

The calculations made in Part II of this example could have been done by anovir::conf_ints_virulence using the output of bbmle::mle2 for the location and scale parameters, their variance and covariance.


# values for a2, b2, var_a2, var_b2, cov_a2b2 were assigned 
# from results of 'bbmle::mle2' above
ls()
#>  [1] "a2"                "b2"                "cov_a2_b2"        
#>  [4] "data01"            "data_sy"           "dh2"              
#>  [7] "dh2_dt"            "dh2_ma07"          "h1t"              
#> [10] "h2"                "hazard_expression" "lower_ci"         
#> [13] "lower_ci2"         "m01"               "m01_prep_function"
#> [16] "m02"               "mu_a1"             "sd_h2"            
#> [19] "sy_censor"         "sy_fq"             "sy_inf"           
#> [22] "sy_times"          "t"                 "upper_ci"         
#> [25] "upper_ci2"         "var_a1"            "var_a2"           
#> [28] "var_b2"            "var_h1t"           "var_h2"

# tail of calculations above
tail(dh2_ma07)
#>        t        h2     dh2_da2    dh2_db2      sd_h2  lower_ci  upper_ci
#>  [9,]  9 0.3013620 -0.08804439 -1.0464000 0.01992884 0.2647269 0.3430669
#> [10,] 10 0.2784536 -0.05889901 -0.9480619 0.01790857 0.2454751 0.3158627
#> [11,] 11 0.2576075 -0.04076261 -0.8615484 0.01619091 0.2277501 0.2913792
#> [12,] 12 0.2390152 -0.02905187 -0.7870835 0.01474095 0.2118009 0.2697262
#> [13,] 13 0.2225442 -0.02123861 -0.7232670 0.01351348 0.1975732 0.2506712
#> [14,] 14 0.2079621 -0.01587353 -0.6684187 0.01246723 0.1849073 0.2338914

# specify mle2object 'm01', Frechet distribution & maximum time
ci_matrix01 <- conf_ints_virulence(
  a2 = a2,
  b2 = b2,
  var_a2 = var_a2,
  var_b2 = var_b2,
  cov_a2b2 = cov_a2_b2,
  d2 = "Frechet", 
  tmax = 14)  

# tail of calculations from 'conf_ints_virulence'
tail(ci_matrix01)
#>        t        h2     dh2_da2    dh2_db2      sd_h2  lower_ci  upper_ci
#>  [9,]  9 0.3013620 -0.08804439 -1.0464000 0.01992884 0.2647269 0.3430669
#> [10,] 10 0.2784536 -0.05889901 -0.9480619 0.01790857 0.2454751 0.3158627
#> [11,] 11 0.2576075 -0.04076261 -0.8615484 0.01619091 0.2277501 0.2913792
#> [12,] 12 0.2390152 -0.02905187 -0.7870835 0.01474095 0.2118009 0.2697262
#> [13,] 13 0.2225442 -0.02123861 -0.7232670 0.01351348 0.1975732 0.2506712
#> [14,] 14 0.2079621 -0.01587353 -0.6684187 0.01246723 0.1849073 0.2338914

# are the results the same? 
identical(dh2_ma07, ci_matrix01)
#> [1] TRUE

Worked example: 3 random variables

In this example the confidence intervals for virulence are estimated for a Weibull hazard function based on three random variables.

Analyses of the data from the study by Lorenz & Koella [3,4] suggested background mortality could be described by the Gumbel distribution and mortality due to infection by the Weibull distribution.

The location parameter describing mortality due to infection a2 was made a linear function of log[dose], requiring two random variables to estimated for its intercept and regression coefficient.

Part I: estimate parameter variables


library(anovir)

# get & rename data
  data01 <- data_lorenz 
  head(data01)
#>    Infectious.dose Food Sex Spore.Count    t censored d g
#> 1            10000  100   F      425000 13.0        0 1 1
#> 2            10000   50   F       22000  3.5        0 1 1
#> 3            10000  100   F      465000 20.5        0 1 1
#> 8                0   50   F          NA 17.0        0 1 0
#> 10          160000   50   F      295000 17.5        0 1 1
#> 11          160000  100   F           0  4.0        0 1 1

# create new version of function 'nll_basic'
  nll_basic2 <- nll_basic

# check location/definition of parameter function a2 ('pfa2')
  body(nll_basic2)[[4]]
#> pfa2 <- a2

# replace 'a2' making 'pfa2' a linear function of log(dose)
  body(nll_basic2)[[4]] <- substitute(pfa2 <- a2i + a2ii * log(data01$Infectious.dose))

# check new version of 'pfa2'
  body(nll_basic2)[[4]]
#> pfa2 <- a2i + a2ii * log(data01$Infectious.dose)

# update formals, including names for time, censor, etc..
  formals(nll_basic2) <- alist(a1 = a1, b1 = b1, a2i = a2i, a2ii = a2ii, b2 = b2,
    data = data01, time = t, censor = censored, infected_treatment = g,
    d1 = 'Gumbel', d2 = 'Weibull')

# prepare 'prep_function' for analysis
  m01_prep_function <- function(a1, b1, a2i, a2ii, b2){
    nll_basic2(a1, b1, a2i, a2ii, b2)
  }

# send to 'bbmle::mle2' with starting values for variables
  m01 <- mle2(m01_prep_function,
      start = list(a1 = 23, b1 = 4, a2i = 4, a2ii = -0.1, b2 = 0.2)
      )

# summary of results
  summary(m01)
#> Maximum likelihood estimation
#> 
#> Call:
#> mle2(minuslogl = m01_prep_function, start = list(a1 = 23, b1 = 4, 
#>     a2i = 4, a2ii = -0.1, b2 = 0.2))
#> 
#> Coefficients:
#>       Estimate Std. Error z value     Pr(z)    
#> a1   23.183049   0.655694 35.3565 < 2.2e-16 ***
#> b1    4.710901   0.382051 12.3306 < 2.2e-16 ***
#> a2i   3.835129   0.190420 20.1404 < 2.2e-16 ***
#> a2ii -0.080453   0.017615 -4.5673 4.941e-06 ***
#> b2    0.185591   0.019461  9.5365 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 1506.186

Part II: calculate confidence intervals


# collect & assign results in mleobject 'm01'
  coef(m01)
#>          a1          b1         a2i        a2ii          b2 
#> 23.18304856  4.71090112  3.83512910 -0.08045345  0.18559113
  vcov(m01)
#>                a1            b1           a2i          a2ii            b2
#> a1    0.429934340  0.0106586090 -0.0391348080  2.854322e-03  2.150702e-03
#> b1    0.010658609  0.1459627064 -0.0081116490  8.870024e-04 -3.118436e-03
#> a2i  -0.039134808 -0.0081116490  0.0362596312 -3.326361e-03  2.809542e-04
#> a2ii  0.002854322  0.0008870024 -0.0033263614  3.102946e-04 -3.905585e-05
#> b2    0.002150702 -0.0031184359  0.0002809542 -3.905585e-05  3.787366e-04

# means
           a2i <- m01@coef[[3]]
          a2ii <- m01@coef[[4]]
            b2 <- m01@coef[[5]]
# variances
       var_a2i <- m01@vcov[3,3]
      var_a2ii <- m01@vcov[4,4]
        var_b2 <- m01@vcov[5,5]
# covariances
  cov_a2i_a2ii <- m01@vcov[3,4]
    cov_a2i_b2 <- m01@vcov[3,5]
   cov_a2ii_b2 <- m01@vcov[4,5]

# specify timespan to calculate
  t <- 1:28

# specify dose treatment to calculate
  dose <- 5000

# write an 'expression' for hazard function
  hazard_expression <- expression(
    1 / (b2 * t) * exp(((log(t) - (a2i + a2ii * log(dose)))) / b2)
    )

# prepare expression for 'stats::deriv` 
# has names of variables for partial differentiation 
  dh2_dt <- stats::deriv(hazard_expression, c('a2i', 'a2ii', 'b2'))

# evaluate (calculate) hazard function 'h2'
  h2 <- eval(dh2_dt)

# collect estimate of hazard function 'h2' 
# and 'gradients' of partial derivatives
  dh2 <- attr(h2, 'gradient')

# rename columns for clarity
  colnames(dh2) <- c('dh2_da2i', 'dh2_da2ii', 'dh2_db2')

# collect time, estimates of hazard function at time 't'
# and partial derivatives 
  dh2 <- cbind(t, h2, dh2)

# calculate variance of hazard function
# using delta function  
  var_h2 <- 
    dh2[,'dh2_da2i']^2 * var_a2i +
    dh2[,'dh2_da2ii']^2 * var_a2ii +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2i'] * dh2[,'dh2_da2ii'] * cov_a2i_a2ii +
    2 * dh2[,'dh2_da2i'] * dh2[,'dh2_db2'] * cov_a2i_b2 +
    2 * dh2[,'dh2_da2ii'] * dh2[,'dh2_db2'] * cov_a2ii_b2 

# calculate standard deviation of estimate
  sd_h2 <- sqrt(var_h2)

# bind to other estimates
  dh2 <- cbind(dh2, sd_h2)

# calculate lower & upper bounds of 95% confidence interval
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

# bind together
  dh2_5000 <- cbind(dh2, lower_ci, upper_ci)

Repeat calculations for dose = 160000; not shown.

Plot results;

Confidence intervals back-calculated from `nll_basic_logscale`

The nll_basic_logscale function assumes the values of the input variables are on a log-scale. Within the function these values are exponentiated before being sent to the likelihood function. The use of conf_ints_virulence will not provide the correct confidence intervals with the output from nll_basic_logscale.

The following code returns the same confidence intervals as that estimated for the same data without a log-transformation.

Part I: estimate parameter variables


library(anovir)

  data01 <- subset(data_blanford,
    (data_blanford$block == 3) &
      ((data_blanford$treatment == 'cont') |
      (data_blanford$treatment == 'Bb06')) &
    (data_blanford$day > 0)
    )

    l01_prep_function_log <- function(a1 = a1, b1 = b1, a2 = a2, b2 = b2){
      nll_basic_logscale(
        a1 = a1, b1 = b1, a2 = a2, b2 = b2,
        data = data01,
        time = t,
        censor = censor,
        infected_treatment = inf,
        d1 = 'Weibull', d2 = 'Weibull')
        }

    l01 <- mle2(l01_prep_function_log,
             start = list(a1 = 1, b1 = 1, a2 = 1, b2 = 1)
             )

    summary(l01)
#> Maximum likelihood estimation
#> 
#> Call:
#> mle2(minuslogl = l01_prep_function_log, start = list(a1 = 1, 
#>     b1 = 1, a2 = 1, b2 = 1))
#> 
#> Coefficients:
#>     Estimate Std. Error z value     Pr(z)    
#> a1  1.045594   0.024271 43.0791 < 2.2e-16 ***
#> b1 -0.728183   0.089151 -8.1680 3.135e-16 ***
#> a2  0.948089   0.011094 85.4622 < 2.2e-16 ***
#> b2 -1.697579   0.172747 -9.8269 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 1293.144

Part II: calculate confidence intervals


# collect and assign results
  coef(l01)
#>         a1         b1         a2         b2 
#>  1.0455942 -0.7281833  0.9480889 -1.6975787
  vcov(l01)
#>               a1            b1            a2            b2
#> a1  0.0005891043  0.0011221235 -0.0001266466  0.0014110750
#> b1  0.0011221235  0.0079478157 -0.0001426868 -0.0014237986
#> a2 -0.0001266466 -0.0001426868  0.0001230693 -0.0006601062
#> b2  0.0014110750 -0.0014237986 -0.0006601062  0.0298416332

        a2 <- coef(l01)[[3]]
        b2 <- coef(l01)[[4]]

    var_a2 <- vcov(l01)[3,3]
    var_b2 <- vcov(l01)[4,4]

  cov_a2_b2 <- vcov(l01)[3,4]

# set timescale for calculations
  t <- 1:15

# NB need to exponentiate the terms with 'a2' and 'b2'
# in hazard expression
  hazard_expression <- expression(
    (1/(exp(b2) * t)) * exp((log(t) - exp(a2)) / exp(b2))
  )

# remaining steps are the same as in Example 2 above
  dh2_dt <- stats::deriv(hazard_expression, c('a2', 'b2'))
  h2 <- eval(dh2_dt)
  dh2 <- attr(h2, 'gradient')
  colnames(dh2) <- c('dh2_da2', 'dh2_db2')
  dh2 <- cbind(t, h2, dh2) 
  var_h2 <- 
    dh2[,'dh2_da2']^2 * var_a2 +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2'] * dh2[,'dh2_db2'] * cov_a2_b2
  sd_h2 <- sqrt(var_h2)
  dh2 <- cbind(dh2, sd_h2)
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

  dh2_backlog <- cbind(dh2, lower_ci, upper_ci)
  dh2_backlog <- round(dh2_backlog,4)
  dh2_backlog
#>        t     h2  dh2_da2 dh2_db2  sd_h2 lower_ci upper_ci
#>  [1,]  1 0.0000  -0.0001  0.0001 0.0000   0.0000   0.0004
#>  [2,]  2 0.0001  -0.0013  0.0008 0.0002   0.0000   0.0024
#>  [3,]  3 0.0006  -0.0078  0.0039 0.0007   0.0000   0.0069
#>  [4,]  4 0.0020  -0.0283  0.0111 0.0020   0.0003   0.0148
#>  [5,]  5 0.0054  -0.0765  0.0234 0.0044   0.0011   0.0266
#>  [6,]  6 0.0122  -0.1725  0.0405 0.0079   0.0035   0.0431
#>  [7,]  7 0.0244  -0.3432  0.0601 0.0122   0.0091   0.0651
#>  [8,]  8 0.0442  -0.6226  0.0768 0.0169   0.0208   0.0936
#>  [9,]  9 0.0747  -1.0529  0.0818 0.0212   0.0428   0.1303
#> [10,] 10 0.1195  -1.6847  0.0621 0.0245   0.0799   0.1788
#> [11,] 11 0.1829  -2.5772 -0.0002 0.0286   0.1346   0.2484
#> [12,] 12 0.2696  -3.7994 -0.1285 0.0403   0.2011   0.3614
#> [13,] 13 0.3853  -5.4297 -0.3520 0.0693   0.2708   0.5481
#> [14,] 14 0.5362  -7.5569 -0.7069 0.1220   0.3433   0.8376
#> [15,] 15 0.7295 -10.2802 -1.2365 0.2046   0.4210   1.2639

  # the confidence intervals are the same as those
  # estimated on the help page of
  # `conf_ints_virulence` for variables that are 
  # not assumed to be on a logscale

References

1. Sy VE, Agnew P, Sidobre C, Michalakis Y. 2014 Reduced survival and reproductive success generates selection pressure for the dengue mosquito aedes aegypti to evolve resistance against infection by the microsporidian parasite vavraia culicis. Evolutionary Applications 7, 468–479. (doi:10.1111/eva.12144)

2. Blanford S, Jenkins NE, Read AF, Thomas MB. 2012 Evaluating the lethal and pre-lethal effects of a range of fungi against adult anopheles stephensi mosquitoes. Malaria Journal 11, 10. (doi:10.1186/1475-2875-11-365)

3. Lorenz LM, Koella JC. 2011 The microsporidian parasite vavraia culicis as a potential late life-acting control agent of malaria. Evolutionary Applications 4, 783–790. (doi:10.1111/j.1752-4571.2011.00199.x)

4. Lorenz LM, Koella JC. 2011 Data from: The microsporidian parasite vavraia culicis as a potential late life-acting control agent of malaria. Dryad Digital Repository (doi:10.5061/dryad.2s231)

Confidence intervals

The delta method

Setting confidence interval bounds

Worked example: 1 random variable

Worked example: 2 random variables

Part I: estimate parameter variables

Part II: calculate confidence intervals

Part III: compare results with those of conf_ints_virulence

Worked example: 3 random variables

Part I: estimate parameter variables

Part II: calculate confidence intervals

Confidence intervals back-calculated from nll_basic_logscale

Part I: estimate parameter variables

Part II: calculate confidence intervals

References

Part III: compare results with those of `conf_ints_virulence`

Confidence intervals back-calculated from `nll_basic_logscale`