---
title: "R package cases: overview"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{R package cases: overview}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

The goal of this vignette is to illustrate the R package **cases** with some elementary code examples.

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  out.width = "100%"
)
```

## Preparation

Load the package:

```{r setup}
library(cases)
```

## Important functions

### categorize()

Often, binary predictions are not readily available but rather need to be derived from continuous (risk) scores. This can be done via the `categorize()` function.

```{r categorize1}
## simulate continuous scores for 3 markers
set.seed(123)
M <- as.data.frame(mvtnorm::rmvnorm(10, mean = rep(0, 3), sigma = 2 * diag(3)))
M

## categorize at cutpoint 0 by default
yhat <- categorize(M)
yhat

## define multiple cutpoints to obtain multiple decision rules per marker
C <- c(0, 1, 0, 1, 0, 1)
a <- c(1, 1, 2, 2, 3, 3)
categorize(M, C, a)

## this can even be used for multi-class classification, like this:
C <- matrix(rep(c(-1, 0, 1, -2, 0, 2), 3), ncol = 3, byrow = TRUE)
C
categorize(M, C, a)
```

### compare()

In supervised classification, we assume that a set of true labels is available. In medical testing, this reference standard is usually provided by an established diagnostic or prognostic tool. Model predictions need to be compared against these labels in order to compute model accuracy.

```{r compare1}
## consider the binary predictions of the 3 decision rules from the previous chunk
names(yhat) <- paste0("rule", 1:ncol(yhat))
yhat

## assume true labels
y <- c(rep(1, 5), rep(0, 5))

## comparing predictions and labels then results in
compare(yhat, y)
```

### evaluate()

`evaluate()` is the main function of the package.

```{r evaluate1}
evaluate(compare(yhat, y))
```

More details on the `evaluate()` function are provided in the section on multiple testing below.

### draw_data()

**cases** includes a few functions for synthetic data generation:

```{r draw_data1}
draw_data_lfc(n = 20)
```

```{r draw_data2}
draw_data_roc(n = 20)
```

Remark: The synthetic data comes at the 'compared' level, meaning the values 1 and 0 indicate correct and incorrect predictions, respectively. There is no need to call `compare()` in addition.

## Common workflows

The pipe operator `%>%` allows us to chain together subsequent operations in R. This is useful, as the `evaluate()` function expects preprocessed data indicating correct (1) and incorrect (0) predictions.

```{r workflow1}
M %>%
  categorize() %>%
  compare(y) %>%
  evaluate()
```

## Multiple testing for co-primary endpoints

The underlying statistical methodology is described in detail in [1].

### Specification of hypotheses

The R command

```{r dtafun1, eval=FALSE}
?evaluate
```

gives an overview of the function arguments of `evaluate()`.

- `comparator` defines one of the classification rules under consideration to be the primary comparator.
- `benchmark` is a pre-defined accuracy benchmark for each subgroup.

Together, these define the hypothesis system under consideration, namely the global null hypothesis

$$H_0: \forall j \, \exists g: \theta_j^g \leq \theta_0^g,$$

stating that for every rule $j$, at least one of the co-primary accuracy criteria fails to exceed its benchmark $\theta_0^g$.

In the application of primary interest, diagnostic accuracy studies, this simplifies to $G=2$ with $\theta^1 = Se$ and $\theta^2 = Sp$ indicating sensitivity and specificity of a medical test or classification rule. In this case, we aim to reject the global null hypothesis

$$H_0: \forall j: Se_j \leq Se_0 \,\vee\, Sp_j \leq Sp_0,$$

i.e. to show that at least one rule $j$ exceeds both benchmarks simultaneously ($Se_j > Se_0$ and $Sp_j > Sp_0$).
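To make this concrete, the following minimal sketch runs a co-primary analysis on the toy predictions `yhat` and labels `y` from the chunks above. The benchmark values 0.7 (sensitivity) and 0.8 (specificity) and the significance level are chosen purely for illustration:

```{r coprimary_sketch}
## co-primary analysis: a rule is only successful if both its
## sensitivity and specificity exceed the respective benchmark
## (benchmark values are illustrative, not recommendations)
yhat %>%
  compare(y) %>%
  evaluate(
    alternative = "greater",
    alpha = 0.025,
    benchmark = c(0.7, 0.8),
    analysis = "co-primary",
    regu = TRUE
  )
```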
### Comparison vs. confidence regions

In the following, we highlight the difference between the "co-primary" analysis (comparison regions) and a "full" analysis (confidence regions). We first generate a synthetic dataset:

```{r}
set.seed(1337)
data <- draw_data_roc(
  n = 120,
  prev = c(0.25, 0.75),
  m = 4,
  delta = 0.05,
  e = 10,
  auc = seq(0.90, 0.95, 0.025),
  rho = c(0.25, 0.25)
)
lapply(data, head)
```

```{r viz_comp}
## comparison regions
results_comp <- data %>%
  evaluate(
    alternative = "greater",
    alpha = 0.025,
    benchmark = c(0.7, 0.8),
    analysis = "co-primary",
    regu = TRUE,
    adj = "maxt"
  )
visualize(results_comp)
```

```{r}
## confidence regions
results_conf <- data %>%
  evaluate(
    alternative = "greater",
    alpha = 0.025,
    benchmark = c(0.7, 0.8),
    analysis = "full",
    regu = TRUE,
    adj = "maxt"
  )
visualize(results_conf)
```

As we can see, the comparison regions are more liberal than the confidence regions.

## Real data example

A second vignette shows an application of the **cases** package to the Breast Cancer Wisconsin Diagnostic (wdbc) data set:

```{r example_wdbc, eval=FALSE, echo=TRUE}
vignette("example_wdbc", "cases")
```

## References

1. Westphal M, Zapf A. Statistical inference for diagnostic test accuracy studies with multiple comparisons. Statistical Methods in Medical Research. 2024;0(0). [doi:10.1177/09622802241236933](https://journals.sagepub.com/doi/full/10.1177/09622802241236933)