--- title: "Storing and Analyzing Imputed Data with rbmiUtils" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 2 number_sections: true vignette: > %\VignetteIndexEntry{Storing and Analyzing with rbmiUtils} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Introduction This vignette demonstrates how to: * Perform multiple imputation using the `{rbmi}` package. * Store and modify the imputed data using `{rbmiUtils}`. * Analyze the imputed data using: * A standard ANCOVA on a continuous endpoint (`CHG`) * A binary responder analysis on `CRIT1FLN` using `{beeca}` This pattern enables reproducible workflows where imputation and analysis can be separated and revisited independently. # Statistical Context This approach applies **Rubin’s Rules** for inference after multiple imputation: > We fit a model to each imputed dataset, dervive a response variable on the CHG score, extract marginal effects or other statistics of interest, and combine the results into a single inference using Rubin’s combining rules. --- # Step 1: Setup and Data Preparation ```{r libraries, message = FALSE, warning = FALSE} library(dplyr) library(tidyr) library(readr) library(purrr) library(rbmi) library(beeca) library(rbmiUtils) ``` ```{r seed} set.seed(1974) ``` ```{r load-data} data("ADEFF") ADEFF <- ADEFF %>% mutate( TRT = factor(TRT01P, levels = c("Placebo", "Drug A")), USUBJID = factor(USUBJID), AVISIT = factor(AVISIT) ) ``` # Step 2: Define Imputation Model ```{r define-vars} vars <- set_vars( subjid = "USUBJID", visit = "AVISIT", group = "TRT", outcome = "CHG", covariates = c("BASE", "STRATA", "REGION") ) ``` ```{r define-method} method <- method_bayes( n_samples = 100, control = control_bayes(warmup = 200, thin = 2) ) ``` ```{r impute} dat <- ADEFF %>% select(USUBJID, STRATA, REGION, REGIONC, TRT, BASE, CHG, AVISIT) draws_obj <- draws(data = dat, vars = vars, method = method) impute_obj <- impute(draws_obj, references = c("Placebo" = "Placebo", "Drug A" = "Placebo")) ADMI <- get_imputed_data(impute_obj) ``` # Step 3: Add Responder Variables ```{r derive-responder} ADMI <- ADMI %>% mutate( CRIT1FLN = ifelse(CHG > 3, 1, 0), CRIT1FL = ifelse(CRIT1FLN == 1, "Y", "N"), CRIT = "CHG > 3" ) ``` # Step 4: Continuous Endpoint Analysis (CHG) ```{r analyse-chg} ana_obj_ancova <- analyse_mi_data( data = ADMI, vars = vars, method = method, fun = ancova ) ``` ```{r pool-chg} pool_obj_ancova <- pool(ana_obj_ancova) print(pool_obj_ancova) ``` ```{r tidy-chg} tidy_pool_obj(pool_obj_ancova) ``` # Step 5: Responder Endpoint Analysis (CRIT1FLN) ## Define Analysis Function ```{r gcomp-fun} gcomp_responder <- function(data, ...) 
---

# Step 1: Setup and Data Preparation

```{r libraries, message = FALSE, warning = FALSE}
library(dplyr)
library(tidyr)
library(readr)
library(purrr)
library(rbmi)
library(beeca)
library(rbmiUtils)
```

```{r seed}
set.seed(1974)
```

```{r load-data}
data("ADEFF")

ADEFF <- ADEFF %>%
  mutate(
    TRT = factor(TRT01P, levels = c("Placebo", "Drug A")),
    USUBJID = factor(USUBJID),
    AVISIT = factor(AVISIT)
  )
```

# Step 2: Define Imputation Model

```{r define-vars}
vars <- set_vars(
  subjid = "USUBJID",
  visit = "AVISIT",
  group = "TRT",
  outcome = "CHG",
  covariates = c("BASE", "STRATA", "REGION")
)
```

```{r define-method}
method <- method_bayes(
  n_samples = 100,
  control = control_bayes(warmup = 200, thin = 2)
)
```

```{r impute}
dat <- ADEFF %>%
  select(USUBJID, STRATA, REGION, REGIONC, TRT, BASE, CHG, AVISIT)

draws_obj <- draws(data = dat, vars = vars, method = method)

impute_obj <- impute(
  draws_obj,
  references = c("Placebo" = "Placebo", "Drug A" = "Placebo")
)

ADMI <- get_imputed_data(impute_obj)
```

# Step 3: Add Responder Variables

```{r derive-responder}
ADMI <- ADMI %>%
  mutate(
    CRIT1FLN = ifelse(CHG > 3, 1, 0),
    CRIT1FL = ifelse(CRIT1FLN == 1, "Y", "N"),
    CRIT = "CHG > 3"
  )
```

# Step 4: Continuous Endpoint Analysis (CHG)

```{r analyse-chg}
ana_obj_ancova <- analyse_mi_data(
  data = ADMI,
  vars = vars,
  method = method,
  fun = ancova
)
```

```{r pool-chg}
pool_obj_ancova <- pool(ana_obj_ancova)
print(pool_obj_ancova)
```

```{r tidy-chg}
tidy_pool_obj(pool_obj_ancova)
```

# Step 5: Responder Endpoint Analysis (CRIT1FLN)

## Define Analysis Function

```{r gcomp-fun}
gcomp_responder <- function(data, ...) {
  model <- glm(CRIT1FLN ~ TRT + BASE + STRATA + REGION, data = data, family = binomial)

  marginal_fit <- get_marginal_effect(
    model,
    trt = "TRT",
    method = "Ge",
    type = "HC0",
    contrast = "diff",
    reference = "Placebo"
  )

  res <- marginal_fit$marginal_results

  list(
    trt = list(
      est = res[res$STAT == "diff", "STATVAL"][[1]],
      se = res[res$STAT == "diff_se", "STATVAL"][[1]],
      df = NA
    )
  )
}
```

## Define Variables and Run Analysis

```{r vars-binary}
vars_binary <- set_vars(
  subjid = "USUBJID",
  visit = "AVISIT",
  group = "TRT",
  outcome = "CRIT1FLN",
  covariates = c("BASE", "STRATA", "REGION")
)
```

```{r analyse-binary}
ana_obj_prop <- analyse_mi_data(
  data = ADMI,
  vars = vars_binary,
  method = method,
  fun = gcomp_responder
)
```

```{r pool-binary}
pool_obj_prop <- pool(ana_obj_prop)
print(pool_obj_prop)
```

# Final Notes

* The `ADMI` object can be saved for later reuse (see the sketch below).
* Analyses can be modularly applied using custom functions.
* The tidy output from `tidy_pool_obj()` is helpful for reporting and review.
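For example, the imputed data can be stored once and the analysis revisited in a later session without re-running the imputation. A minimal sketch, assuming an `.rds` file is a suitable storage format and using a hypothetical file path:

```{r save-admi, eval = FALSE}
# Hypothetical path -- adjust to your own project layout.
saveRDS(ADMI, file = "ADMI.rds")

# In a later session, reload the imputed data and re-run only the
# analysis and pooling steps (analyse_mi_data() followed by pool()).
ADMI <- readRDS("ADMI.rds")
```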