#make_cate# Predicting the conditional average treatment effect (CATE) on a new policy based on training over an old policy
##Description##
make_cate generates the conditional average treatment effect (CATE) for both a training dataset and a testing (or new) dataset related to a binary (treated vs. untreated) policy program. It provides the main input for running:
- opl_tb: optimal policy learning of a threshold-based policy
- opl_tb_c: optimal policy learning of a threshold-based policy at specific thresholds
- opl_lc: optimal policy learning of a linear-combination policy
- opl_lc_c: optimal policy learning of a linear-combination policy at specific parameters
- opl_dt: optimal policy learning of a decision-tree policy
- opl_dt_c: optimal policy learning of a decision-tree policy at specific thresholds and selection variables

The approach is based on Kitagawa and Tetenov (2018); the main econometrics supported by these commands can be found in Cerulli (2022).
##Function Syntax##
make_cate(model, train_data, test_data, w, x, y, family = gaussian(), ntree = 100, mtry = 2)
##Arguments:##
- model: A string indicating the model to use. Valid options are "glm" and "rf".
- train_data: The training dataset used for estimating the treatment effect on the old policy.
- test_data: The test dataset used for estimating the treatment effect on the new policy.
- w: The treatment variable, given as a column name (as in the example below).
- x: Independent variables for the model, given as a character vector of column names.
- y: The outcome variable, given as a column name.
- family: The family type for the GLM (e.g., "binomial", "gaussian"; default is gaussian()).
- ntree: Number of trees for the Random Forest model (default is 100).
- mtry: Number of variables to consider at each split of the Random Forest model (default is 2).
##Return Value##
An object containing the estimated causal treatment effect results, including:
- Average Treatment Effect (ATE)
- Average Treatment Effect on Treated (ATET)
- Average Treatment Effect on Non-Treated (ATENT)
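For reference, these three quantities are usually defined as averages of the CATE. The notation below (potential outcomes Y(1) and Y(0), covariates X, treatment indicator W, and the symbol τ) is an assumption introduced here for illustration. This page does not spell out how make_cate computes the values it prints, so the block should be read as the standard textbook definitions rather than a description of the function's internals.

```latex
\begin{aligned}
\tau(x)        &= \mathbb{E}\left[\,Y(1) - Y(0) \mid X = x\,\right] \\
\mathrm{ATE}   &= \mathbb{E}\left[\tau(X)\right] \\
\mathrm{ATET}  &= \mathbb{E}\left[\tau(X) \mid W = 1\right] \\
\mathrm{ATENT} &= \mathbb{E}\left[\tau(X) \mid W = 0\right]
\end{aligned}
```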
##Example Usage##
set.seed(42)
train_data <- data.frame(
y = rnorm(100), # Outcome
x1 = rnorm(100), # Covariate
x2 = rnorm(100),
w = sample(0:1, 100, replace = TRUE) # Treatment
)
test_data <- data.frame(
y = rnorm(100), # Outcome
x1 = rnorm(100), # Covariate
x2 = rnorm(100),
w = sample(0:1, 100, replace = TRUE) # Treatment
)
x <- c("x1", "x2") # covariates
y <- "y" # outcome (dependent) variable
w <- "w" # treatment variable
family <- "gaussian" # family for glm
ntree <- 100 # number of trees for random forest
mtry <- 2 # number of variables to consider at each split
result <- make_cate(model = "glm", train_data = train_data, test_data = test_data, w = w, x = x, y = y)
#> --------------------------------------------------
#> - Treatment-effects estimation: OLD POLICY -
#> --------------------------------------------------
#> Number of obs = 100
#> Estimator = regression adjustment
#> Outcome model = linear
#> DIM = 0.0705516
#> ATE = 0.0490622
#> ATET = 0.0889665
#> ATENT = 0.0399044
#> --------------------------------------------------
#> - Treatment-effects estimation: NEW POLICY -
#> --------------------------------------------------
#> Number of obs = 100
#> Estimator = regression adjustment
#> Outcome model = linear
#> DIM = 0.3071428
#> ATE = 0.0196866
#> ATET = 0.0887607
#> ATENT = 0.0690741
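The run above uses model = "glm". To give a feel for what a Random Forest variant might look like, here is a minimal sketch that fits one forest per treatment arm and takes the difference of their predictions on the test data. It assumes the randomForest package is installed; whether make_cate relies on that package internally is not stated here, and the names rf1, rf0, treated, and cate_test are hypothetical.

```r
# Illustrative only: not the internals of make_cate.
# Assumes the randomForest package is available.
library(randomForest)

set.seed(42)
n <- 100
train <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n),
                    w = sample(0:1, n, replace = TRUE))
test  <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n),
                    w = sample(0:1, n, replace = TRUE))
x <- c("x1", "x2")

treated <- train$w == 1

# One forest per treatment arm; ntree and mtry mirror the defaults
# listed in the function syntax (ntree = 100, mtry = 2).
rf1 <- randomForest(x = train[treated, x],  y = train$y[treated],
                    ntree = 100, mtry = 2)
rf0 <- randomForest(x = train[!treated, x], y = train$y[!treated],
                    ntree = 100, mtry = 2)

# Unit-level effect on the test data: predicted outcome if treated
# minus predicted outcome if untreated.
cate_test <- predict(rf1, newdata = test[, x]) -
             predict(rf0, newdata = test[, x])
```

With only two covariates, mtry = 2 means every split considers both variables, so each forest behaves like bagged regression trees.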
##Detailed Steps:##
- Train and Test Data: The function separates the data into treated and untreated groups for both the training and test datasets.
- Model Estimation: The function estimates the treatment effect using either a GLM or a Random Forest model. It then calculates the predicted outcome for both the treated and untreated groups.
- Causal Effect Calculation: The CATE is calculated as the difference in predicted outcomes between the treated and untreated groups.
- Output: The function returns the estimated treatment effects (ATE, ATET, ATENT) and the treatment effects for both the training and test datasets.
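The steps above can be sketched in base R as a plain regression adjustment: one linear model per treatment arm, predictions from both models for every unit, and the CATE as their difference. This is a minimal sketch under stated assumptions, not the package's implementation. The object names (fit1, fit0, cate_train, cate_test) are hypothetical, and the aggregates follow the textbook definitions, which may differ from how make_cate computes the figures it prints.

```r
# Illustrative regression-adjustment sketch; not the internals of make_cate.
set.seed(42)
n <- 100
train <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n),
                    w = sample(0:1, n, replace = TRUE))
test  <- data.frame(y = rnorm(n), x1 = rnorm(n), x2 = rnorm(n),
                    w = sample(0:1, n, replace = TRUE))

x <- c("x1", "x2")  # covariate column names
y <- "y"            # outcome column name
w <- "w"            # treatment column name

# Build the formula y ~ x1 + x2 from the column names.
f <- reformulate(x, response = y)

# Step 1-2: split the training data by treatment status and fit
# one outcome model per arm.
fit1 <- lm(f, data = train[train[[w]] == 1, ])
fit0 <- lm(f, data = train[train[[w]] == 0, ])

# Step 3: unit-level CATE as the difference between the two predictions,
# for the training (old policy) and test (new policy) data.
cate_train <- predict(fit1, newdata = train) - predict(fit0, newdata = train)
cate_test  <- predict(fit1, newdata = test)  - predict(fit0, newdata = test)

# Step 4: textbook aggregates on the training data.
ate   <- mean(cate_train)
atet  <- mean(cate_train[train[[w]] == 1])
atent <- mean(cate_train[train[[w]] == 0])
```

Note that the models are fitted on the training data only; the test data contribute covariates for prediction, which is what allows the CATE to be carried over to a new policy.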
##Results##
The following results are output for both the old (training) and new (test) policy; the header reads OLD POLICY or NEW POLICY accordingly:

| Treatment-effects estimation: OLD POLICY | |
|---|---|
| Number of obs | [n] |
| Estimator | regression adjustment |
| Outcome model | linear |
| DIM | [DIM_value] |
| ATE | [ATE_value] |
| ATET | [ATET_value] |
| ATENT | [ATENT_value] |
##Conclusion##
The make_cate function estimates the causal treatment effect of a policy using either a GLM or a Random Forest model. It reports treatment effects for both the old and the new policy, which makes it useful for causal inference in policy analysis.