The ExactMed functions

Introduction

This document aims to illustrate the usage of the functions exactmed(), exactmed_c() and exactmed_cat(), as well as their behavior via additional examples. All functions compute natural direct and indirect effects, and controlled direct effects for a binary outcome. However each function handles a specific type of mediator: exactmed() accommodates a binary mediator, exactmed_c() a continuous mediator and exactmed_cat() a categorical mediator. Details on the use of the function exactmed() are provided next. Usage of exactmed_c() and exactmed_cat() is similar to that of exactmed(), but differs on some aspects described thereafter.

In exactmed(), the user can specify the high levels of the outcome and mediator variables using the input parameters hvalue_m and hvalue_y, respectively (see the function help). Controlled direct effects are obtained for both possible mediator values (\(m=0\) and \(m=1\)). Natural and controlled effects can either be unadjusted (crude) or adjusted for covariates (that is, conditional effects). By default, adjusted effects estimates are obtained for covariates fixed at their sample-specific mean values (for numerical covariates and categorical covariates through associated dummies). Alternatively, adjusted effects estimates can be obtained for specific values of the covariates that are user-provided. Also, by default, exactmed() incorporates a mediator-exposure interaction term in the outcome model, which can be removed by setting interaction=FALSE. Concerning interval estimates, exactmed() generates, by default, \(95\%\) confidence intervals obtained by the delta method. Alternatively, percentile bootstrap confidence intervals, instead of delta method confidence intervals, can be obtained by specifying boot=TRUE in the function call. In this case, 1000 bootstrap data sets are generated by default.

In exactmed_c() and exactmed_cat(), only the high level of the outcome variable can be specified (using the input parameter hvalue_y). Moreover, for each scale, the controlled direct effect is computed at a mediator value or level specified by the user using the parameter mf. By default, this parameter is fixed at the sample-specific mean of the mediator in exactmed_c(), whereas it is fixed at the reference level of the mediator in exactmed_cat(). In order to use exactmed_cat(), the mediator must be coded as a factor variable in the data set. By default, the reference level of the mediator is the first level of the corresponding factor variable. The extra input parameter blevel_m of the exactmed_cat() function allows the user to change the default reference level to any other level. It is worth noting that parameter blevel_m only potentially impacts the value of the controlled direct effect (not the natural direct and indirect effects).

Due to the similarity between exactmed(), exactmed_c() and exactmed_cat() in terms of use and options offered to the user, most examples will be presented with the exactmed() function. In all the exactmed() examples presented below we use the data set datamed, available after loading the ExactMed package. Some of the features of this data set can be found in its corresponding help file (help(datamed)). A user interested in the exactmed_c() or exactmed_cat() functions for the continuous or categorical mediator cases, respectively, will only need to change the name of the function (and data set) in the calling of these examples to understand their use. The data sets datamed_c and datamed_cat, which feature a continuous and a categorical mediator, respectively, are presented at the end of the document along with a few calling examples.

Lastly, we recall that all ExactMed functions only work on data frames with named columns and no missing values.

> library(ExactMed)
> 
> head(datamed)
    X M Y C1         C2
  1 1 1 0  0  0.3753731
  2 1 0 0  1  0.1971635
  3 1 0 1  1 -0.5971041
  4 0 1 0  0  0.7576990
  5 0 1 0  0  1.2056864
  6 0 1 0  0 -0.5983204

The following command verifies whether the data set contains any missing values:

> 
> as.logical(sum(is.na(datamed)))
  [1] FALSE

Basic examples

Suppose that one wishes to obtain unadjusted (crude) mediation effects estimates for a change in exposure from \(0\) to \(1\), assuming there is no exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals.

In this case, a valid call to exactmed() would be:

> 
> results1 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', 
+   a1 = 1, a0 = 0, interaction = FALSE
+   )  
  'exactmed' will compute unadjusted natural effects
> 
> results1
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.04899   0.27469 1.57554 2.66472 8.7484e-08
  Indirect effect  0.88069   0.03097 0.82204 0.94352 0.00030218
  Total effect     1.80452   0.23756 1.39414 2.33571 7.3256e-06
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.30183   0.06489 1.18065 1.43543 1.2130e-07
  Indirect effect  0.96248   0.01014 0.94281 0.98257 0.00028454
  Total effect     1.25298   0.06379 1.13399 1.38446 9.4328e-06
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.16514   0.03021  0.10593  0.22435 4.5850e-08
  Indirect effect -0.02672   0.00732 -0.04107 -0.01238 0.00026137
  Total effect     0.13842   0.03048  0.07868  0.19815 5.5775e-06
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.07495   0.28607 1.58363 2.71870 1.1937e-07
  RR scale  1.26843   0.05652 1.16234 1.38420 9.5079e-08
  RD scale  0.15878   0.02892 0.10210 0.21546 4.0057e-08
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.07495   0.28607 1.58363 2.71870 1.1937e-07
  RR scale  1.40138   0.09725 1.22317 1.60555 1.1566e-06
  RD scale  0.17947   0.03343 0.11395 0.24499 7.9365e-08
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.81241    0.09811  -8.281  < 2e-16 ***
  X            0.90623    0.13212   6.859 6.92e-12 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.3  on 999  degrees of freedom
  Residual deviance: 1310.8  on 998  degrees of freedom
  AIC: 1314.8
  
  Number of Fisher Scoring iterations: 4
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)   0.3702     0.1015   3.648 0.000264 ***
  X             0.7299     0.1379   5.294 1.19e-07 ***
  M            -0.5825     0.1388  -4.198 2.69e-05 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1291.9  on 997  degrees of freedom
  AIC: 1297.9
  
  Number of Fisher Scoring iterations: 4

Mediation effects estimates adjusted for covariates are obtained through the use of the character vectors m_cov and y_cov, which contain the names of the covariates to be adjusted for in the mediator and outcome models, respectively. The following call to exactmed() incorporates covariates C1 and C2 in both the mediator and outcome models:

> 
> results2 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,  
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   interaction = FALSE
+   )
> 
> results2
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.91812   0.27925 1.44197 2.55150 7.6751e-06
  Indirect effect  0.99263   0.05626 0.88826 1.10927    0.89619
  Total effect     1.90399   0.26714 1.44622 2.50664 4.4407e-06
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.27053   0.06748 1.14492 1.40991 6.5375e-06
  Indirect effect  0.99782   0.01667 0.96568 1.03103    0.89593
  Total effect     1.26776   0.06670 1.14354 1.40547 6.5043e-06
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%   97.5%      P.val
  Direct effect    0.15019   0.03272  0.08605 0.21432 4.4402e-06
  Indirect effect -0.00154   0.01178 -0.02463 0.02155    0.89604
  Total effect     0.14865   0.03195  0.08602 0.21127 3.2871e-06
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  1.91814   0.27936 1.44182 2.55183 7.7378e-06
  RR scale  1.26961   0.06625 1.14619 1.40633 4.7657e-06
  RD scale  0.15000   0.03236 0.08657 0.21342 3.5640e-06
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  1.91814   0.27936 1.44182 2.55183 7.7378e-06
  RR scale  1.27393   0.07778 1.13025 1.43587 7.3286e-05
  RD scale  0.15087   0.03454 0.08318 0.21857 1.2534e-05
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.6082     0.1404  -4.333 1.47e-05 ***
  X             1.4678     0.1718   8.542  < 2e-16 ***
  C1           -1.2652     0.1681  -7.527 5.19e-14 ***
  C2            1.6482     0.1153  14.292  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  929.49  on 996  degrees of freedom
  AIC: 937.49
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.40420    0.13526  -2.988 0.002805 ** 
  X            0.65136    0.14564   4.472 7.74e-06 ***
  M           -0.02255    0.17259  -0.131 0.896063    
  C1           1.27063    0.14615   8.694  < 2e-16 ***
  C2          -0.29998    0.08356  -3.590 0.000331 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1205.3  on 995  degrees of freedom
  AIC: 1215.3
  
  Number of Fisher Scoring iterations: 4

The exactmed() function also allows for the specification of two different sets of covariates in the mediator and outcome models. For example, the following specification of m_cov and y_cov means that the mediator model is adjusted for C1 and C2, while the outcome model is adjusted for C1 only.

However, we advise against this practice unless it is known that excluded covariates are independent of the dependent variable (mediator or outcome) being modeled given the rest of covariates.

> 
> results3 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,  
+   m_cov = c('C1', 'C2'), y_cov = c('C1'), 
+   interaction = FALSE
+   )
> 
> results3
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.07747   0.29467 1.57326 2.74328 2.5397e-07
  Indirect effect  0.88888   0.04509 0.80476 0.98180   0.020229
  Total effect     1.84663   0.25710 1.40563 2.42600 1.0554e-05
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.29629   0.06554 1.17400 1.43133 2.8560e-07
  Indirect effect  0.96677   0.01378 0.94013 0.99416   0.017754
  Total effect     1.25321   0.06518 1.13176 1.38771 1.4271e-05
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.16572   0.03132  0.10433  0.22710 1.2175e-07
  Indirect effect -0.02409   0.01021 -0.04410 -0.00409   0.018258
  Total effect     0.14162   0.03174  0.07941  0.20384 8.1377e-06
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.08515   0.29879 1.57458 2.76128 2.9258e-07
  RR scale  1.28132   0.06150 1.16628 1.40771 2.4082e-07
  RD scale  0.16264   0.03059 0.10269 0.22259 1.0544e-07
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.08515   0.29879 1.57458 2.76128 2.9258e-07
  RR scale  1.36112   0.09014 1.19544 1.54976 3.2297e-06
  RD scale  0.17702   0.03443 0.10954 0.24450 2.7259e-07
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.6082     0.1404  -4.333 1.47e-05 ***
  X             1.4678     0.1718   8.542  < 2e-16 ***
  C1           -1.2652     0.1681  -7.527 5.19e-14 ***
  C2            1.6482     0.1153  14.292  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  929.49  on 996  degrees of freedom
  AIC: 937.49
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.2629     0.1282  -2.050   0.0403 *  
  X             0.7348     0.1433   5.128 2.93e-07 ***
  M            -0.3543     0.1460  -2.426   0.0153 *  
  C1            1.1916     0.1428   8.345  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1218.5  on 996  degrees of freedom
  AIC: 1226.5
  
  Number of Fisher Scoring iterations: 4

By default, the adjusted parameter is TRUE. If the adjusted parameter is set to FALSE, exactmed() ignores the values of the vectors m_cov and y_cov and computes unadjusted (crude) effects estimates as in the first example above:

> 
> results4 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1'), 
+   adjusted = FALSE, interaction = FALSE
+   )
  'exactmed' will compute unadjusted natural effects
> 
> results4
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.04899   0.27469 1.57554 2.66472 8.7484e-08
  Indirect effect  0.88069   0.03097 0.82204 0.94352 0.00030218
  Total effect     1.80452   0.23756 1.39414 2.33571 7.3256e-06
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.30183   0.06489 1.18065 1.43543 1.2130e-07
  Indirect effect  0.96248   0.01014 0.94281 0.98257 0.00028454
  Total effect     1.25298   0.06379 1.13399 1.38446 9.4328e-06
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.16514   0.03021  0.10593  0.22435 4.5850e-08
  Indirect effect -0.02672   0.00732 -0.04107 -0.01238 0.00026137
  Total effect     0.13842   0.03048  0.07868  0.19815 5.5775e-06
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.07495   0.28607 1.58363 2.71870 1.1937e-07
  RR scale  1.26843   0.05652 1.16234 1.38420 9.5079e-08
  RD scale  0.15878   0.02892 0.10210 0.21546 4.0057e-08
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.07495   0.28607 1.58363 2.71870 1.1937e-07
  RR scale  1.40138   0.09725 1.22317 1.60555 1.1566e-06
  RD scale  0.17947   0.03343 0.11395 0.24499 7.9365e-08
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.81241    0.09811  -8.281  < 2e-16 ***
  X            0.90623    0.13212   6.859 6.92e-12 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.3  on 999  degrees of freedom
  Residual deviance: 1310.8  on 998  degrees of freedom
  AIC: 1314.8
  
  Number of Fisher Scoring iterations: 4
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)   0.3702     0.1015   3.648 0.000264 ***
  X             0.7299     0.1379   5.294 1.19e-07 ***
  M            -0.5825     0.1388  -4.198 2.69e-05 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1291.9  on 997  degrees of freedom
  AIC: 1297.9
  
  Number of Fisher Scoring iterations: 4

To perform an adjusted mediation analysis allowing for exposure-mediator interaction (by default, the interaction parameter is TRUE) and using bootstrap based on \(100\) resamples with initial random seed \(= 1991\) to construct \(97\%\) confidence intervals, one should call exactmed() as follows:

> 
> results5 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   boot = TRUE, nboot = 100, bootseed = 1991, confcoef = 0.97
+   )
> 
> results5
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    1.5%   98.5%
  Direct effect    2.24870   0.37516 1.68812 3.32099
  Indirect effect  0.88856   0.07284 0.75874 1.07384
  Total effect     1.99810   0.28480 1.51926 2.71363
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    1.5%   98.5%
  Direct effect    1.33700   0.07272 1.20829 1.48229
  Indirect effect  0.96726   0.02151 0.92984 1.02289
  Total effect     1.29323   0.06769 1.16111 1.44939
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     1.5%   98.5%
  Direct effect    0.18403   0.03365  0.12012 0.25767
  Indirect effect -0.02390   0.01591 -0.05367 0.01544
  Total effect     0.16013   0.03112  0.09568 0.22870
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    1.5%   98.5%
  OR scale  2.61843   0.55335 1.82091 4.25427
  RR scale  1.41151   0.09319 1.25366 1.59634
  RD scale  0.21741   0.04099 0.14197 0.30413
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error     1.5%   98.5%
  OR scale  1.30748   0.30771  0.79211 2.15864
  RR scale  1.10061   0.09454  0.92150 1.35743
  RD scale  0.06150   0.05285 -0.05274 0.18411
  
  First bootstrap replications of natural effects on OR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      2.456120       0.9408412     2.310819
  [2,]      2.133962       0.9077365     1.937075
  [3,]      1.811932       1.0971231     1.987913
  
  First bootstrap replications of natural effects on RR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      1.427061       0.9818925     1.401220
  [2,]      1.303377       0.9735270     1.268873
  [3,]      1.262116       1.0294193     1.299246
  
  First bootstrap replications of natural effects on RD scale: 
  
       Direct effect Indirect effect Total effect
  [1,]     0.2114903     -0.01279682    0.1986934
  [2,]     0.1704897     -0.01939046    0.1510993
  [3,]     0.1406345      0.01992190    0.1605564
  
  First bootstrap replications of controlled direct effect (m=0) on OR scale: 
  
  [1] 2.779720 2.412692 1.871147
  
  First bootstrap replications of controlled direct effect (m=0) on RR scale: 
  
  [1] 1.503371 1.356003 1.291167
  
  First bootstrap replications of controlled direct effect (m=0) on RD scale: 
  
  [1] 0.2401263 0.1963779 0.1501348
  
  First bootstrap replications of controlled direct effect (m=1) on OR scale: 
  
  [1] 1.643801 1.491866 1.593185
  
  First bootstrap replications of controlled direct effect (m=1) on RR scale: 
  
  [1] 1.210986 1.154799 1.163927
  
  First bootstrap replications of controlled direct effect (m=1) on RD scale: 
  
  [1] 0.11712924 0.09186107 0.10191851
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.6082     0.1404  -4.333 1.47e-05 ***
  X             1.4678     0.1718   8.542  < 2e-16 ***
  C1           -1.2652     0.1681  -7.527 5.19e-14 ***
  C2            1.6482     0.1153  14.292  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  929.49  on 996  degrees of freedom
  AIC: 937.49
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.5207     0.1443  -3.609 0.000308 ***
  X             0.9626     0.1991   4.835 1.33e-06 ***
  M             0.3393     0.2308   1.470 0.141501    
  X:M          -0.6945     0.2929  -2.371 0.017746 *  
  C1            1.2771     0.1468   8.700  < 2e-16 ***
  C2           -0.3091     0.0838  -3.689 0.000225 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1199.6  on 994  degrees of freedom
  AIC: 1211.6
  
  Number of Fisher Scoring iterations: 4

Firth’s penalization

In the situation where we believe that we are facing a problem of separation or quasi-separation, Firth’s penalization can be used by setting the Firth parameter to TRUE (Firth penalized mediation analysis). If this is the case, Firth’s penalization is applied to both the mediator model and the outcome model.

The Firth parameter implements Firth’s penalization to reduce the bias of the regression coefficients estimators under scarce or sparse data (see details in exactmed() help page):

> 
> results6 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), Firth = TRUE, 
+   boot = TRUE, nboot = 100, bootseed = 1991, confcoef = 0.97
+   )
> 
> results6
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    1.5%   98.5%
  Direct effect    2.23246   0.36864 1.68036 3.28209
  Indirect effect  0.88991   0.07197 0.76166 1.07288
  Total effect     1.98668   0.28090 1.51385 2.69182
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    1.5%   98.5%
  Direct effect    1.33452   0.07217 1.20680 1.47867
  Indirect effect  0.96751   0.02135 0.93035 1.02269
  Total effect     1.29116   0.06719 1.16007 1.44614
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     1.5%   98.5%
  Direct effect    0.18263   0.03343  0.11919 0.25573
  Indirect effect -0.02367   0.01576 -0.05316 0.01527
  Total effect     0.15896   0.03093  0.09498 0.22710
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    1.5%   98.5%
  OR scale  2.59835   0.54383 1.81264 4.20506
  RR scale  1.40883   0.09257 1.25203 1.59214
  RD scale  0.21597   0.04077 0.14098 0.30221
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error     1.5%   98.5%
  OR scale  1.30556   0.30467  0.79391 2.14590
  RR scale  1.10034   0.09396  0.92202 1.35523
  RD scale  0.06124   0.05249 -0.05229 0.18295
  
  First bootstrap replications of natural effects on OR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      2.440099       0.9413288     2.296936
  [2,]      2.118584       0.9089399     1.925665
  [3,]      1.804247       1.0958134     1.977118
  
  First bootstrap replications of natural effects on RR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      1.424154       0.9819734     1.398481
  [2,]      1.301066       0.9737439     1.266905
  [3,]      1.260400       1.0291349     1.297121
  
  First bootstrap replications of natural effects on RD scale: 
  
       Direct effect Indirect effect Total effect
  [1,]     0.2101089     -0.01271723    0.1973917
  [2,]     0.1691184     -0.01918931    0.1499291
  [3,]     0.1397075      0.01970156    0.1594091
  
  First bootstrap replications of controlled direct effect (m=0) on OR scale: 
  
  [1] 2.761083 2.393397 1.862962
  
  First bootstrap replications of controlled direct effect (m=0) on RR scale: 
  
  [1] 1.500221 1.353409 1.289262
  
  First bootstrap replications of controlled direct effect (m=0) on RD scale: 
  
  [1] 0.2387231 0.1948956 0.1491569
  
  First bootstrap replications of controlled direct effect (m=1) on OR scale: 
  
  [1] 1.638567 1.487659 1.589117
  
  First bootstrap replications of controlled direct effect (m=1) on RR scale: 
  
  [1] 1.209995 1.154090 1.163587
  
  First bootstrap replications of controlled direct effect (m=1) on RD scale: 
  
  [1] 0.11647760 0.09132799 0.10154966
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data, 
      method = brglmFit, type = "MPL_Jeffreys")
  
  Deviance Residuals: 
      Min       1Q   Median       3Q      Max  
  -2.4703  -0.7139  -0.2851   0.7728   2.6103  
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.6042     0.1400  -4.314 1.60e-05 ***
  X             1.4589     0.1714   8.514  < 2e-16 ***
  C1           -1.2578     0.1677  -7.502 6.28e-14 ***
  C2            1.6365     0.1147  14.263  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  929.51  on 996  degrees of freedom
  AIC:  937.51
  
  Type of estimator: MPL_Jeffreys (maximum penalized likelihood with Jeffreys'-prior penalty)
  Number of Fisher Scoring iterations: 2
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data, 
      method = brglmFit, type = "MPL_Jeffreys")
  
  Deviance Residuals: 
      Min       1Q   Median       3Q      Max  
  -2.2271  -1.0771   0.6198   0.9125   1.6972  
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.51661    0.14418  -3.583 0.000339 ***
  X            0.95488    0.19866   4.807 1.54e-06 ***
  M            0.33581    0.23059   1.456 0.145307    
  X:M         -0.68824    0.29254  -2.353 0.018639 *  
  C1           1.26829    0.14656   8.654  < 2e-16 ***
  C2          -0.30649    0.08368  -3.663 0.000250 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1199.6  on 994  degrees of freedom
  AIC:  1211.6
  
  Type of estimator: MPL_Jeffreys (maximum penalized likelihood with Jeffreys'-prior penalty)
  Number of Fisher Scoring iterations: 2

Stratum-specific effects

The following call to exactmed() returns mediation effects adjusted for the covariates C1 and C2, when the values of the covariates C1 and C2 are \(0.1\) and \(0.4\), respectively, assuming an exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals:

> 
> results7 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C1 = 0.1, C2 = 0.4), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+   )
> 
> results7
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.86610   0.27725 1.39466 2.49690 2.6816e-05
  Indirect effect  0.89347   0.06371 0.77694 1.02747 0.11413480
  Total effect     1.66730   0.25143 1.24065 2.24066 0.00069917
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.37432   0.10503 1.18315 1.59639 3.1743e-05
  Indirect effect  0.95099   0.03022 0.89357 1.01210 0.11377623
  Total effect     1.30697   0.10409 1.11809 1.52777 0.00077525
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%   97.5%      P.val
  Direct effect    0.15465   0.03625  0.08360 0.22571 1.9913e-05
  Indirect effect -0.02783   0.01759 -0.06230 0.00664 0.11360218
  Total effect     0.12683   0.03702  0.05427 0.19938 0.00061244
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.61843   0.52125 1.77253 3.86803 1.3290e-06
  RR scale  1.63173   0.16066 1.34536 1.97906 6.5954e-07
  RD scale  0.23603   0.04712 0.14369 0.32838 5.4583e-07
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error     2.5%   97.5%   P.val
  OR scale  1.30748   0.28346  0.85486 1.99974 0.21622
  RR scale  1.14677   0.12957  0.91897 1.43104 0.22549
  RD scale  0.06689   0.05389 -0.03873 0.17251 0.21448
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.6082     0.1404  -4.333 1.47e-05 ***
  X             1.4678     0.1718   8.542  < 2e-16 ***
  C1           -1.2652     0.1681  -7.527 5.19e-14 ***
  C2            1.6482     0.1153  14.292  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  929.49  on 996  degrees of freedom
  AIC: 937.49
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.5207     0.1443  -3.609 0.000308 ***
  X             0.9626     0.1991   4.835 1.33e-06 ***
  M             0.3393     0.2308   1.470 0.141501    
  X:M          -0.6945     0.2929  -2.371 0.017746 *  
  C1            1.2771     0.1468   8.700  < 2e-16 ***
  C2           -0.3091     0.0838  -3.689 0.000225 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1199.6  on 994  degrees of freedom
  AIC: 1211.6
  
  Number of Fisher Scoring iterations: 4

Common adjustment covariates in vectors m_cov and y_cov must have the same values; otherwise, the execution of the exactmed() function is aborted and an error message is displayed in the R console. Example:

> exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C1 = 0.3, C2 = 0.4), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+  )
  Error in .check_input_param(data = data, a = a, m = m, y = y, a1 = a1, : Covariate C1 has two different values specified

If the covariates specified in m_cov_cond (y_cov_cond) constitute some proper subset of m_cov (y_cov) then the other covariates are set to their sample-specific mean levels. Hence, the call

> 
> results8 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C1 = 0.1), y_cov_cond = c(C1 = 0.1)
+   )

is equivalent to:

> 
>  mc2 <- mean(datamed$C2)
>  mc2
  [1] -0.04769712
> 
> results9 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C1 = 0.1, C2 = mc2), y_cov_cond = c(C1 = 0.1, C2 = mc2)
+   )

This can be checked by comparing the two outputs:

> 
> all.equal(results8, results9)
  [1] TRUE

With this in mind, an error is easily predicted if one makes this call:

> exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C1 = 0.1), y_cov_cond = c(C1 = 0.1, C2 = 0.4)
+   )
  Error in .check_input_param(data = data, a = a, m = m, y = y, a1 = a1, : Covariate C2 has two different values specified (one implicitly)

Categorical covariates

The exactmed() function also allows for categorical covariates. Covariates of this type must appear in the data frame as factor, character, or logical columns. To illustrate how exactmed() works with categorical covariates, we replace the covariate C1 in the data set datamed by a random factor column:

> 
> cate <- factor(sample(c("a", "b", "c"), nrow(datamed), replace =TRUE))
> datamed$C1 <- cate

It is possible to estimate mediation effects at specific values of categorical covariates using the input parameters m_cov_cond and y_cov_cond. Note that if the targeted covariates are a mixture of numerical and categorical covariates, the above parameters require to be list-type vectors, instead of atomic vectors as when covariates are only numerical or only categorical.

Hence, if one wants to estimate mediation effects at level ‘a’ for C1 and at value \(0.4\) for C2, assuming an exposure-mediator interaction and using the delta method to construct \(95\%\) confidence intervals, exactmed() should be called as follows:

> 
> results10 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = list(C1 = 'a', C2 = 0.4), y_cov_cond = list(C1 = 'a', C2 = 0.4)
+   )
> 
> results10
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.93666   0.26913 1.47490 2.54298 1.9726e-06
  Indirect effect  0.80707   0.05473 0.70663 0.92179  0.0015722
  Total effect     1.56302   0.22095 1.18478 2.06201  0.0015806
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.28784   0.07186 1.15443 1.43667 5.7977e-06
  Indirect effect  0.93157   0.02153 0.89031 0.97474  0.0021622
  Total effect     1.19971   0.07055 1.06911 1.34627  0.0019593
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.15482   0.03215  0.09180  0.21784 1.4724e-06
  Indirect effect -0.04740   0.01505 -0.07691 -0.01790  0.0016367
  Total effect     0.10742   0.03377  0.04123  0.17361  0.0014684
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.61643   0.50061 1.79824 3.80690 4.9841e-07
  RR scale  1.39356   0.09416 1.22072 1.59089 9.0296e-07
  RD scale  0.21365   0.04027 0.13473 0.29258 1.1236e-07
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error     2.5%   97.5%   P.val
  OR scale  1.37503   0.28624  0.91436 2.06781 0.12605
  RR scale  1.14656   0.10575  0.95694 1.37375 0.13812
  RD scale  0.07787   0.05103 -0.02214 0.17789 0.12699
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.8950     0.1615  -5.540 3.02e-08 ***
  X             1.3943     0.1656   8.419  < 2e-16 ***
  C1b          -0.5073     0.1934  -2.623  0.00871 ** 
  C1c          -0.2790     0.1906  -1.463  0.14334    
  C2            1.5664     0.1109  14.128  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  983.59  on 995  degrees of freedom
  AIC: 993.59
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  0.25324    0.15193   1.667   0.0956 .  
  X            0.96181    0.19133   5.027 4.98e-07 ***
  M           -0.04639    0.21774  -0.213   0.8313    
  X:M         -0.64333    0.28142  -2.286   0.0223 *  
  C1b          0.00162    0.16759   0.010   0.9923    
  C1c         -0.13997    0.16069  -0.871   0.3837    
  C2          -0.20336    0.07922  -2.567   0.0103 *  
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1279.5  on 993  degrees of freedom
  AIC: 1293.5
  
  Number of Fisher Scoring iterations: 4

If one does not specify a value for the categorical covariate C1, exactmed() computes the effects by assigning each dummy variable, created internally by exactmed() for each non-reference level of C1, to a value equal to the proportion of observations in the corresponding category (equivalent to setting each dummy variable to its mean value):

> 
> results11 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   m_cov_cond = c(C2 = 0.4), y_cov_cond = c(C2 = 0.4)
+   )
> 
> results11
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.01817   0.28018 1.53741 2.64928 4.2356e-07
  Indirect effect  0.79861   0.05731 0.69383 0.91921 0.00172510
  Total effect     1.61173   0.22105 1.23183 2.10879 0.00050099
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.31391   0.07090 1.18203 1.46049 4.2158e-07
  Indirect effect  0.92786   0.02211 0.88552 0.97223  0.0016799
  Total effect     1.21912   0.06962 1.09003 1.36350  0.0005212
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.16525   0.03184  0.10286  0.22765 2.0946e-07
  Indirect effect -0.04990   0.01579 -0.08084 -0.01896 0.00157225
  Total effect     0.11536   0.03280  0.05107  0.17964 0.00043667
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.61643   0.50061 1.79824 3.80690 4.9841e-07
  RR scale  1.40827   0.09257 1.23803 1.60191 1.9076e-07
  RD scale  0.21668   0.04026 0.13777 0.29559 7.3675e-08
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error     2.5%   97.5%   P.val
  OR scale  1.37503   0.28624  0.91436 2.06781 0.12605
  RR scale  1.15094   0.10884  0.95622 1.38531 0.13713
  RD scale  0.07836   0.05129 -0.02217 0.17889 0.12656
  
  
  Mediator model: 
  
  
  Call:
  glm(formula = Mform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  -0.8950     0.1615  -5.540 3.02e-08 ***
  X             1.3943     0.1656   8.419  < 2e-16 ***
  C1b          -0.5073     0.1934  -2.623  0.00871 ** 
  C1c          -0.2790     0.1906  -1.463  0.14334    
  C2            1.5664     0.1109  14.128  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1359.28  on 999  degrees of freedom
  Residual deviance:  983.59  on 995  degrees of freedom
  AIC: 993.59
  
  Number of Fisher Scoring iterations: 5
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept)  0.25324    0.15193   1.667   0.0956 .  
  X            0.96181    0.19133   5.027 4.98e-07 ***
  M           -0.04639    0.21774  -0.213   0.8313    
  X:M         -0.64333    0.28142  -2.286   0.0223 *  
  C1b          0.00162    0.16759   0.010   0.9923    
  C1c         -0.13997    0.16069  -0.871   0.3837    
  C2          -0.20336    0.07922  -2.567   0.0103 *  
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1330.1  on 999  degrees of freedom
  Residual deviance: 1279.5  on 993  degrees of freedom
  AIC: 1293.5
  
  Number of Fisher Scoring iterations: 4

Case-control data

exactmed() can also compute mediation effects with a binary outcome and a binary mediator when the data come from a classical case-control study wherein the probability of being selected only depends on the outcome status. To do so, the true outcome prevalence (that is, the population prevalence \(P(Y = hvalue\_y))\) must be known and the yprevalence parameter set to this value. exactmed() accounts for the ascertainment in the sample by employing weighted regression techniques that use inverse-probability weighting (IPW) with robust standard errors (see details in the documentation).

The following call to exactmed() returns mediation effects supposing that the data have been obtained from a case-control study and that the true outcome prevalence is \(0.1\):

> 
> results12 <- exactmed(
+   data = datamed, a = 'X', m = 'M', y = 'Y', 
+   a1 = 1, a0 = 0, interaction = FALSE, yprevalence = 0.1
+   )
  'exactmed' will compute unadjusted natural effects
> 
> results12
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.19582   0.31893 1.65183 2.91896 6.1132e-08
  Indirect effect  0.82180   0.04229 0.74295 0.90901 0.00013694
  Total effect     1.80452   0.24723 1.37956 2.36039 1.6439e-05
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.01152   0.25650 1.56670 2.58265 4.2319e-08
  Indirect effect  0.84501   0.03675 0.77597 0.92019 0.00010767
  Total effect     1.69975   0.20855 1.33643 2.16184 1.5354e-05
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%      P.val
  Direct effect    0.07750   0.01607  0.04601  0.10900 1.4134e-06
  Indirect effect -0.02389   0.00700 -0.03760 -0.01018 0.00063914
  Total effect     0.05361   0.01309  0.02796  0.07927 4.2084e-05
  
  Controlled direct effect (m=0): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.20913   0.32572 1.65469 2.94934 7.6324e-08
  RR scale  1.99136   0.24919 1.55824 2.54488 3.7019e-08
  RD scale  0.08966   0.01974 0.05097 0.12835 5.5656e-06
  
  Controlled direct effect (m=1): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.20913   0.32572 1.65469 2.94934 7.6324e-08
  RR scale  2.08516   0.28792 1.59077 2.73322 1.0270e-07
  RD scale  0.05335   0.00992 0.03390 0.07281 7.5963e-08
  
  
  Mediator model: 
  
  
  z test of coefficients:
  
              Estimate Std. Error z value  Pr(>|z|)    
  (Intercept) -0.68616    0.13294 -5.1614 2.451e-07 ***
  X            1.27375    0.19468  6.5428 6.037e-11 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  
  Outcome model: 
  
  
  z test of coefficients:
  
               Estimate Std. Error  z value  Pr(>|z|)    
  (Intercept) -2.308276   0.099612 -23.1727 < 2.2e-16 ***
  X            0.792597   0.147443   5.3756 7.632e-08 ***
  M           -0.653826   0.149353  -4.3777 1.199e-05 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Of note, the same optional parameters described in the previous sections are available in the case-control study context.

Mediation analysis with a continuous mediator

As mentioned in the introduction, in the case of a continuous mediator, the ExactMed package allows the user to obtain estimates of the different mediation effects using the exactmed_c() function, which essentially offers the same options as exactmed(). The only difference is the absence of the hvalue_m parameter and the addition of the mf parameter, the latter allowing to set the value of the mediator in the calculation of the controlled direct effect (by default fixed at the sample-specific mean of the mediator).

For illustration, the package also makes available to the user the datamed_c data set containing a continuous mediator variable. Some of the features of this data set can be found in its corresponding help file (help(datamed_c)). We recall that the exactmed_c() function only works on data frames with named columns and no missing values.

> 
> library(ExactMed)
> 
> head(datamed_c)
    X          M Y C1         C2
  1 1 -0.6559769 1  0  0.3753731
  2 1  0.8445136 1  1  0.1971635
  3 1 -1.9282753 1  1 -0.5971041
  4 0  0.9826885 0  0  0.7576990
  5 0 -0.5505834 0  0  1.2056864
  6 0  0.3611274 1  0 -0.5983204

We provide below an example of call to exactmed_c() that allows to obtain estimates of conditional mediation effects supposing no exposure-mediator interaction in the outcome regression model:

> 
> results13 <- exactmed_c(
+   data = datamed_c, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,  
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   interaction = FALSE
+   )
> 
> results13
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.39946   0.34660 1.80783 3.18470  1.368e-09
  Indirect effect  1.32343   0.07327 1.18734 1.47512  4.159e-07
  Total effect     3.17552   0.43497 2.42784 4.15345 < 2.22e-16
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.52926   0.10643 1.33427 1.75275 1.0366e-09
  Indirect effect  1.10184   0.02380 1.05617 1.14948 7.1014e-06
  Total effect     1.68500   0.10868 1.48491 1.91205 6.6613e-16
  
  Natural effects on RD scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    0.21520   0.03435 0.14788 0.28252 3.7132e-10
  Indirect effect  0.06332   0.01308 0.03768 0.08896 1.2951e-06
  Total effect     0.27853   0.03130 0.21718 0.33987 < 2.22e-16
  
  Controlled direct effect (m=-0.5107948): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.46943   0.35947 1.85647 3.28476 5.2965e-10
  RR scale  1.50025   0.10062 1.31545 1.71100 1.4658e-09
  RD scale  0.21993   0.03428 0.15273 0.28712 1.4081e-10
  
  
  Mediator model: 
  
  
  Call:
  lm(formula = Mform, data = data)
  
  Residuals:
      Min      1Q  Median      3Q     Max 
  -6.1821 -1.4324  0.0568  1.2604  7.1676 
  
  Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
  (Intercept) -0.62373    0.10799  -5.776 1.02e-08 ***
  X            1.53357    0.12476  12.292  < 2e-16 ***
  C1          -1.22692    0.12473  -9.837  < 2e-16 ***
  C2           1.61857    0.06236  25.956  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  Residual standard error: 1.971 on 996 degrees of freedom
  Multiple R-squared:  0.4774,  Adjusted R-squared:  0.4758 
  F-statistic: 303.3 on 3 and 996 DF,  p-value: < 2.2e-16
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.77921    0.12238  -6.367 1.93e-10 ***
  X            0.90399    0.14557   6.210 5.30e-10 ***
  M            0.18828    0.03601   5.228 1.71e-07 ***
  C1           1.26049    0.14987   8.410  < 2e-16 ***
  C2          -0.44883    0.09123  -4.920 8.67e-07 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1377.4  on 999  degrees of freedom
  Residual deviance: 1218.1  on 995  degrees of freedom
  AIC: 1228.1
  
  Number of Fisher Scoring iterations: 4

To perform an adjusted mediation analysis allowing for exposure-mediator interaction, using bootstrap based on \(100\) resamples with initial random seed \(= 1885\) to construct \(95\%\) confidence intervals and computing the controlled direct effect when the mediator is set at the value \(2\), one should call exactmed_c() as follows:

> 
> results14 <- exactmed_c(
+   data = datamed_c, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   boot = TRUE, nboot = 100, bootseed = 1885, confcoef = 0.95,
+   mf = 2
+   )
> 
> results14
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%
  Direct effect    4.31959   0.77257 3.06977 6.12301
  Indirect effect  0.82693   0.06038 0.72646 0.95294
  Total effect     3.57197   0.54375 2.82364 4.80002
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%
  Direct effect    1.84144   0.12202 1.62966 2.11563
  Indirect effect  0.94962   0.01716 0.92646 0.98644
  Total effect     1.74867   0.11280 1.56681 2.01742
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%
  Direct effect    0.34112   0.03489  0.27026  0.41113
  Indirect effect -0.03761   0.01341 -0.05775 -0.00970
  Total effect     0.30351   0.03150  0.24864  0.36907
  
  Controlled direct effect (m=2): 
  
           Estimate Std.error     2.5%    97.5%
  OR scale  0.60828   0.12292  0.43865  0.81843
  RR scale  0.86843   0.04549  0.78712  0.94488
  RD scale -0.10062   0.03721 -0.16694 -0.04072
  
  First bootstrap replications of natural effects on OR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      4.317548       0.8193258     3.537479
  [2,]      2.967199       0.8876172     2.633736
  [3,]      4.949753       0.7552281     3.738193
  
  First bootstrap replications of natural effects on RR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      1.793859       0.9498774     1.703946
  [2,]      1.616215       0.9618525     1.554560
  [3,]      1.963302       0.9267452     1.819480
  
  First bootstrap replications of natural effects on RD scale: 
  
       Direct effect Indirect effect Total effect
  [1,]     0.3366462     -0.03812869    0.2985175
  [2,]     0.2618394     -0.02619802    0.2356414
  [3,]     0.3709888     -0.05538875    0.3156000
  
  First bootstrap replications of controlled direct effect on OR scale: 
  
  [1] 0.4826180 0.4767809 0.6314553
  
  First bootstrap replications of controlled direct effect on RR scale: 
  
  [1] 0.8329215 0.8063200 0.8673280
  
  First bootstrap replications of controlled direct effect on RD scale: 
  
  [1] -0.13581560 -0.15128670 -0.09790019
  
  
  Mediator model: 
  
  
  Call:
  lm(formula = Mform, data = data)
  
  Residuals:
      Min      1Q  Median      3Q     Max 
  -6.1821 -1.4324  0.0568  1.2604  7.1676 
  
  Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
  (Intercept) -0.62373    0.10799  -5.776 1.02e-08 ***
  X            1.53357    0.12476  12.292  < 2e-16 ***
  C1          -1.22692    0.12473  -9.837  < 2e-16 ***
  C2           1.61857    0.06236  25.956  < 2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  Residual standard error: 1.971 on 996 degrees of freedom
  Multiple R-squared:  0.4774,  Adjusted R-squared:  0.4758 
  F-statistic: 303.3 on 3 and 996 DF,  p-value: < 2.2e-16
  
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -0.51704    0.13059  -3.959 7.52e-05 ***
  X            0.74814    0.15871   4.714 2.43e-06 ***
  M            0.49722    0.05253   9.465  < 2e-16 ***
  X:M         -0.62263    0.06358  -9.793  < 2e-16 ***
  C1           1.39342    0.16191   8.606  < 2e-16 ***
  C2          -0.53693    0.09816  -5.470 4.50e-08 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1377.4  on 999  degrees of freedom
  Residual deviance: 1102.9  on 994  degrees of freedom
  AIC: 1114.9
  
  Number of Fisher Scoring iterations: 4

Mediation analysis with a categorical mediator

As mentioned in the introduction, in the case of a categorical mediator (coded as factor), the ExactMed package allows the user to obtain estimates of mediation effects through the exactmed_cat() function, which basically offers the same options as exactmed(). The only difference is the absence of the hvalue_m parameter and the addition of two extra parameters: blevel_m and mf. The first one allows to set the reference level of the mediator, which by default corresponds to the first level of the corresponding factor variable. The second parameter allows to specify the level of the mediator in the calculation of the controlled direct effect. Parameter blevel_m will thus impact the mediator regression model and associated output by fixing the reference level of the dependent variable. Parameter blevel_m will not impact the values of the natural effects and will impact the controlled direct effect only if the value of the parameter mf is not specified by the user. In this case, the value of the parameter mf will by default correspond to the value of parameter blevel_m.

For illustration, the package also makes available to the user the datamed_cat data set containing a categorical mediator variable. Some of the features of this data set can be found in its corresponding help file (help(datamed_cat)). We recall that the exactmed_cat() function only works on data frames with named columns and no missing values.

> 
> head(datamed_cat)
    X M Y C1         C2
  1 1 c 1  0  0.3753731
  2 1 d 1  1  0.1971635
  3 1 b 1  1 -0.5971041
  4 0 d 0  0  0.7576990
  5 0 c 0  0  1.2056864
  6 0 d 1  0 -0.5983204

We provide below an example of call to exactmed_cat() to obtain estimates of conditional mediation effects supposing no exposure-mediator interaction in the outcome regression model:

> 
> results15 <- exactmed_cat(
+   data = datamed_cat, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0,  
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   interaction = FALSE
+   )
> 
> results15
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    2.37922   0.33703 1.80242 3.14061 9.4267e-10
  Indirect effect  1.39828   0.09855 1.21787 1.60540 1.9687e-06
  Total effect     3.32681   0.45882 2.53882 4.35936 < 2.22e-16
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    1.51375   0.10234 1.32589 1.72822 8.6512e-10
  Indirect effect  1.11869   0.02906 1.06315 1.17713 1.5812e-05
  Total effect     1.69342   0.10787 1.49465 1.91862 2.2204e-16
  
  Natural effects on RD scale: 
  
                  Estimate Std.error    2.5%   97.5%      P.val
  Direct effect    0.21297   0.03366 0.14700 0.27894 2.4890e-10
  Indirect effect  0.07448   0.01621 0.04270 0.10626 4.3589e-06
  Total effect     0.28745   0.03118 0.22634 0.34856 < 2.22e-16
  
  Controlled direct effect (m='a'): 
  
           Estimate Std.error    2.5%   97.5%      P.val
  OR scale  2.52115   0.36610 1.89669 3.35121 1.9134e-10
  RR scale  1.86361   0.17886 1.54405 2.24932 8.8040e-11
  RD scale  0.20031   0.03548 0.13078 0.26985 1.6391e-08
  
  
  Mediator model: 
  
  
  Call:
  mlogit(formula = M ~ 1 | X + C1 + C2, data = data_long, reflevel = ref_level, 
      method = "nr")
  
  Frequencies of alternatives:choice
    a   b   c   d   e 
  0.2 0.2 0.2 0.2 0.2 
  
  nr method
  6 iterations, 0h:0m:0s 
  g'(-H)^-1g = 2.06E-05 
  successive function values within tolerance limits 
  
  Coefficients :
                 Estimate Std. Error z-value  Pr(>|z|)    
  (Intercept):b  0.539165   0.202979  2.6563  0.007901 ** 
  (Intercept):c  0.622236   0.209098  2.9758  0.002922 ** 
  (Intercept):d  0.435369   0.217929  1.9978  0.045743 *  
  (Intercept):e -0.089258   0.244928 -0.3644  0.715541    
  X:b            1.024819   0.228423  4.4865 7.241e-06 ***
  X:c            1.688852   0.247460  6.8248 8.808e-12 ***
  X:d            2.222046   0.259408  8.5658 < 2.2e-16 ***
  X:e            2.750817   0.286159  9.6129 < 2.2e-16 ***
  C1:b          -0.654903   0.220269 -2.9732  0.002947 ** 
  C1:c          -1.167718   0.237979 -4.9068 9.257e-07 ***
  C1:d          -1.537355   0.248169 -6.1948 5.836e-10 ***
  C1:e          -2.271721   0.276774 -8.2079 2.220e-16 ***
  C2:b           0.828863   0.135509  6.1167 9.555e-10 ***
  C2:c           1.662424   0.156101 10.6497 < 2.2e-16 ***
  C2:d           1.997098   0.164942 12.1079 < 2.2e-16 ***
  C2:e           2.934705   0.188921 15.5341 < 2.2e-16 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  Log-Likelihood: -1325.6
  McFadden R^2:  0.17635 
  Likelihood ratio test : chisq = 567.66 (p.value = < 2.22e-16)
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -1.82245    0.21526  -8.466  < 2e-16 ***
  X            0.92471    0.14521   6.368 1.91e-10 ***
  Mb           0.75708    0.22720   3.332 0.000861 ***
  Mc           1.32271    0.24801   5.333 9.64e-08 ***
  Md           1.19699    0.25554   4.684 2.81e-06 ***
  Me           1.44786    0.28800   5.027 4.97e-07 ***
  C1           1.24728    0.14993   8.319  < 2e-16 ***
  C2          -0.42296    0.08824  -4.793 1.64e-06 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1377.4  on 999  degrees of freedom
  Residual deviance: 1211.0  on 992  degrees of freedom
  AIC: 1227
  
  Number of Fisher Scoring iterations: 4

To perform an adjusted mediation analysis allowing for exposure-mediator interaction, using bootstrap based on \(100\) resamples with initial random seed \(= 1875\) to construct \(95\%\) confidence intervals and computing the controlled direct effect at the level ‘c’ of the mediator, one should call exactmed_cat() as follows:

> 
> results16 <- exactmed_cat(
+   data = datamed_cat, a = 'X', m = 'M', y = 'Y', a1 = 1, a0 = 0, 
+   m_cov = c('C1', 'C2'), y_cov = c('C1', 'C2'), 
+   boot = TRUE, nboot = 100, bootseed = 1875, confcoef = 0.95,
+   mf = 'c'
+   )
> 
> results16
  
  Natural effects on OR scale: 
  
                  Estimate Std.error    2.5%   97.5%
  Direct effect    4.41078   0.80943 3.36925 6.33146
  Indirect effect  0.75882   0.07040 0.60896 0.87894
  Total effect     3.34699   0.47761 2.62808 4.35731
  
  Natural effects on RR scale: 
  
                  Estimate Std.error    2.5%   97.5%
  Direct effect    1.85089   0.13401 1.63145 2.11986
  Indirect effect  0.92653   0.02200 0.88034 0.96343
  Total effect     1.71491   0.11612 1.51320 1.94403
  
  Natural effects on RD scale: 
  
                  Estimate Std.error     2.5%    97.5%
  Direct effect    0.34503   0.03611  0.28452  0.41486
  Indirect effect -0.05514   0.01765 -0.09385 -0.02648
  Total effect     0.28989   0.03127  0.23214  0.34706
  
  Controlled direct effect (m='c'): 
  
           Estimate Std.error    2.5%   97.5%
  OR scale  2.23556   1.08567 1.36293 4.23663
  RR scale  1.33131   0.18071 1.10223 1.62581
  RD scale  0.18213   0.06992 0.06808 0.30585
  
  First bootstrap replications of natural effects on OR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      3.766006       0.8060753     3.035684
  [2,]      4.613071       0.6838449     3.154625
  [3,]      3.624178       0.8161027     2.957701
  
  First bootstrap replications of natural effects on RR scale: 
  
       Direct effect Indirect effect Total effect
  [1,]      1.745950       0.9390725     1.639573
  [2,]      1.857263       0.9011499     1.673672
  [3,]      1.679460       0.9448718     1.586874
  
  First bootstrap replications of natural effects on RD scale: 
  
       Direct effect Indirect effect Total effect
  [1,]     0.3120241     -0.04449625    0.2675279
  [2,]     0.3520571     -0.07539621    0.2766609
  [3,]     0.2998179     -0.04085426    0.2589637
  
  First bootstrap replications of controlled direct effect on OR scale: 
  
  [1] 1.770563 1.379440 1.611341
  
  First bootstrap replications of controlled direct effect on RR scale: 
  
  [1] 1.229423 1.122744 1.184028
  
  First bootstrap replications of controlled direct effect on RD scale: 
  
  [1] 0.13105003 0.07395983 0.10863869
  
  
  Mediator model: 
  
  Call:
  multinom(formula = Mform, data = data, trace = FALSE)
  
  Coefficients:
    (Intercept)        X         C1        C2
  b  0.53916078 1.024813 -0.6549007 0.8288642
  c  0.62222777 1.688853 -1.1677146 1.6624211
  d  0.43535996 2.222050 -1.5373555 1.9970966
  e -0.08927925 2.750826 -2.2717164 2.9347141
  
  Std. Errors:
    (Intercept)         X        C1        C2
  b   0.2029782 0.2284233 0.2202692 0.1355087
  c   0.2090972 0.2474595 0.2379791 0.1561005
  d   0.2179288 0.2594080 0.2481688 0.1649421
  e   0.2449284 0.2861596 0.2767736 0.1889210
  
  Residual Deviance: 2651.22 
  AIC: 2683.22 
  
  Outcome model: 
  
  
  Call:
  glm(formula = Yform, family = binomial(link = "logit"), data = data)
  
  Coefficients:
              Estimate Std. Error z value Pr(>|z|)    
  (Intercept) -2.67124    0.27264  -9.798  < 2e-16 ***
  X            3.81300    0.52911   7.206 5.75e-13 ***
  Mb           1.06895    0.31187   3.428 0.000609 ***
  Mc           2.19709    0.33371   6.584 4.59e-11 ***
  Md           2.78578    0.36477   7.637 2.22e-14 ***
  Me           3.37526    0.40984   8.236  < 2e-16 ***
  X:Mb        -1.69550    0.63866  -2.655 0.007936 ** 
  X:Mc        -3.00850    0.61553  -4.888 1.02e-06 ***
  X:Md        -4.11908    0.61667  -6.680 2.40e-11 ***
  X:Me        -4.49990    0.62393  -7.212 5.50e-13 ***
  C1           1.34154    0.16052   8.358  < 2e-16 ***
  C2          -0.48389    0.09375  -5.162 2.45e-07 ***
  ---
  Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
  
  (Dispersion parameter for binomial family taken to be 1)
  
      Null deviance: 1377.4  on 999  degrees of freedom
  Residual deviance: 1112.0  on 988  degrees of freedom
  AIC: 1136
  
  Number of Fisher Scoring iterations: 5

One can note from the previous output that the reference level for the mediator model is by default the first level of the mediator factor variable (blevel_m = 'a'). However, the controlled direct effect is computed at the level ‘c’ of the categorical mediator, as requested by the parameter mf (that is, mf = 'c').