Using nll functions

library(anovir)

Introduciton

This vignette explains how to use the default form of the negative log-likelihood (nll) functions in this package.

In their default form, the various parameters underlying each nll_function are estimated as constants. However, this behaviour can be extended to make these parameters into functions themselves. In such cases, it is the parameters of these parameter functions that are estimated by constants.

How to modify nll_functions is described in a separate vignette: nll_functions_modifying

Two-step preparation

The nll_functions in this package calcuate the negative log-likelihood (nll) for observed patterns of mortality in survival data based on the approach analysing relative survival. The different functions vary in how they assume the data are structured. However the way to use each function is the same and involves a two-step process;

The resulting estimates of these variables can be used to describe the patterns of background mortality and mortality due to infection in the observed data, including the pathogen's virulence.

The following sections provide further details of the two-step process.

Step #1

The first step is to write a preparatory-, or prep-, function which collects the information needed to define and parameterise the likelihood expression to be minimised.

The example below shows Step #1 preparing the function nll_basic for analysis, given the data frame data01.

data01 is a subset of data from the study by Blanford et al [1]. The data are from the uninfected and infected treatments cont and Bb06, respectively, of the 3rd experimental block of the experiment.

head(data01, 6)
#>     block treatment replicate_cage day censor d inf t fq
#> 450     3      cont              1   1      0 1   0 1  0
#> 451     3      cont              1   2      0 1   0 2  0
#> 452     3      cont              1   3      0 1   0 3  0
#> 453     3      cont              1   4      0 1   0 4  1
#> 454     3      cont              1   5      0 1   0 5  0
#> 455     3      cont              1   6      0 1   0 6  1
    m01_prep_function <- function(a1, b1, a2, b2){
      nll_basic(
        a1, b1, a2, b2,
        data = data01,
        time = day,
        censor = censor,
        infected_treatment = inf,
        d1 = 'Weibull', d2 = 'Weibull')
        }

Here,

The information above is used to define a log-likelihood function of the form;

\[\begin{equation} \log L = \sum_{i=1}^{n} \left\{ d \log \left[ h_1(t_i) + g \cdot h_2(t_i) \right] + \log \left[ S_1(t_i) \right] + g \cdot \log \left[ S_2(t_i) \right] \right\} \end{equation}\]

where;

The Weibull distribution was chosen to describe the background rate of mortality and mortality due to infection.

The survival function describing background mortality was,

\[\begin{equation} S_1(t) = \exp \left[ - \exp \left( z_1 \right) \right] \end{equation}\]

with the hazard function being,

\[\begin{equation} h_1(t) = \frac{1}{b_1 t} \exp \left( z_1 \right) \end{equation}\]

where, \(z_1 = \left( \log t - a_1\right) / b_1 \)

Equivalent expressions described mortality due to infection, where the index '1' is replaced by '2'.

a1, b1, a2, b2 are the variables to be estimated, as listed in the formal arguments of m01_prep_function.

back to top

Step #2

Step #2 involves calling the mle2 function of the package bbmle [2] which will estimate the values of variables maximising the likelihood function.

In the example below, mle2 is given the name of the prep-function, m01_prep_function along with a list giving starting values for the variables to be estimated.

    m01 <- mle2(m01_prep_function,
             start = list(a1 = 2, b1 = 0.5, a2 = 2, b2 = 0.5)
             )

Yielding the results,

#> Maximum likelihood estimation
#> 
#> Call:
#> mle2(minuslogl = m01_prep_function, start = list(a1 = 2, b1 = 0.5, 
#>     a2 = 2, b2 = 0.5))
#> 
#> Coefficients:
#>    Estimate Std. Error z value     Pr(z)    
#> a1 2.845099   0.069055 41.2005 < 2.2e-16 ***
#> b1 0.482788   0.043041 11.2168 < 2.2e-16 ***
#> a2 2.580764   0.028630 90.1411 < 2.2e-16 ***
#> b2 0.183133   0.031634  5.7891 7.078e-09 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> -2 log L: 1293.144

The location and scale parameters describing mortality due to infection, a2, b2, lead to a numerical expression for pathogen's virulence at time t as,

\[\begin{equation} \frac{1}{0.183\: t} \exp \left( \frac{\log t - 2.581}{0.183} \right) \\ \end{equation}\]

where the rate of mortality due to infection accelerates over time.

The default of nll_basic returns estimates for, a1, b1, a2, b2. The function conf_ints_virulence can be used to generate a matrix with the estimate of virulence (± 95% c.i.) based on the variance and covariance of estimates for a2 and b2.


    coef(m01)
#>        a1        b1        a2        b2 
#> 2.8450992 0.4827880 2.5807642 0.1831328
    
    vcov(m01)
#>              a1            b1            a2            b2
#> a1  0.004768599  0.0015413625 -0.0009298910  0.0007351280
#> b1  0.001541362  0.0018525636 -0.0001777557 -0.0001259104
#> a2 -0.000929891 -0.0001777557  0.0008196927 -0.0003119921
#> b2  0.000735128 -0.0001259104 -0.0003119921  0.0010007282

    a2 <- coef(m01)[[3]]
    b2 <- coef(m01)[[4]]
    var_a2 <- vcov(m01)[3,3]
    var_b2 <- vcov(m01)[4,4]
    cov_a2b2 <- vcov(m01)[3,4]
    
    ci_matrix01 <- conf_ints_virulence(
      a2 = a2, b2 = b2,
      var_a2 = var_a2, var_b2 = var_b2, cov_a2b2 = cov_a2b2, 
      d2 = "Weibull", tmax = 14)

    tail(ci_matrix01)
#>        t         h2    dh2_da2      dh2_db2      sd_h2   lower_ci  upper_ci
#>  [9,]  9 0.07472008 -0.4080105  0.446496391 0.02120460 0.04284230 0.1303172
#> [10,] 10 0.11954723 -0.6527900  0.338799748 0.02453919 0.07994882 0.1787586
#> [11,] 11 0.18288260 -0.9986341 -0.001438483 0.02857553 0.13463847 0.2484138
#> [12,] 12 0.26960567 -1.4721871 -0.701596941 0.04030686 0.20112658 0.3614003
#> [13,] 13 0.38528853 -2.1038756 -1.922190342 0.06929851 0.27082267 0.5481345
#> [14,] 14 0.53622430 -2.9280634 -3.860097085 0.12200915 0.34329369 0.8375817

    plot(ci_matrix01[, 't'], ci_matrix01[, 'h2'], 
         type = 'l', col = 'red', xlab = 'time', ylab = 'virulence (± 95% ci)'
         )
      lines(ci_matrix01[, 'lower_ci'], col = 'grey')
      lines(ci_matrix01[, 'upper_ci'], col = 'grey')

back to top

References

1. Blanford S, Jenkins NE, Read AF, Thomas MB. 2012 Evaluating the lethal and pre-lethal effects of a range of fungi against adult anopheles stephensi mosquitoes. Malaria Journal 11, 10. (doi:10.1186/1475-2875-11-365)

2. Bolker, B. and R Development Core Team. 2017 Bbmle: Tools for general maximum likelihood estimation. See https://CRAN.R-project.org/package=bbmle.