| Type: | Package |
| Title: | Evidence Accumulation Models |
| Version: | 1.0.1 |
| LinkingTo: | Rcpp |
| Imports: | Rcpp, dplyr, tidyr, arrow, rlang, distributional, stats, parallel, codetools, grDevices, graphics, ggplot2, gridExtra, data.table, purrr, scales |
| Suggests: | testthat (≥ 3.0.0), pbapply, abc |
| Description: | Simulation-based evidence accumulation models for analyzing responses and reaction times in single- and multi-response tasks. The package includes simulation engines for five representative models: the Diffusion Decision Model (DDM), Leaky Competing Accumulator (LCA), Linear Ballistic Accumulator (LBA), Racing Diffusion Model (RDM), and Levy Flight Model (LFM), and extends these frameworks to multi-response settings. The package supports user-defined functions for item-level parameterization and the incorporation of covariates, enabling flexible customization and the development of new model variants based on existing architectures. Inference is performed using simulation-based methods, including Approximate Bayesian Computation (ABC) and Amortized Bayesian Inference (ABI), which allow parameter estimation without requiring tractable likelihood functions. In addition to core inference tools, the package provides modules for parameter recovery, posterior predictive checks, and model comparison, facilitating the study of a wide range of cognitive processes in tasks involving perceptual decision making, memory retrieval, and value-based decision making. Key methods implemented in the package are described in Ratcliff (1978) <doi:10.1037/0033-295X.85.2.59>, Usher and McClelland (2001) <doi:10.1037/0033-295X.108.3.550>, Brown and Heathcote (2008) <doi:10.1016/j.cogpsych.2007.12.002>, Tillman, Van Zandt and Logan (2020) <doi:10.3758/s13423-020-01719-6>, Wieschen, Voss and Radev (2020) <doi:10.20982/tqmp.16.2.p120>, Csilléry, François and Blum (2012) <doi:10.1111/j.2041-210X.2011.00179.x>, Beaumont (2019) <doi:10.1146/annurev-statistics-030718-105212>, and Sainsbury-Dale, Zammit-Mangion and Huser (2024) <doi:10.1080/00031305.2023.2249522>. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| Config/testthat/edition: | 3 |
| RoxygenNote: | 7.3.3 |
| Depends: | R (≥ 4.1.0) |
| URL: | https://github.com/y-guang/eam |
| NeedsCompilation: | yes |
| Packaged: | 2026-01-12 22:46:13 UTC; spike |
| Author: | Guangyu Zhu |
| Maintainer: | Guang Yang <guang.spike.yang@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-01-17 20:10:02 UTC |
Add two summarise_by specs together
Description
S3 method for the + operator to combine two 'eam_summarise_by_spec' objects into a single spec that will apply both operations.
Usage
## S3 method for class 'eam_summarise_by_spec'
e1 + e2
Arguments
e1 |
First eam_summarise_by_spec or eam_summarise_by_tbl object |
e2 |
Second eam_summarise_by_spec or eam_summarise_by_tbl object |
Value
A combined eam_summarise_by_spec object
Join two eam_summarise_by_tbl objects
Description
S3 method for the + operator to join two summary tables created by
summarise_by. Tables must have identical .wider_by attributes
to be joined.
Usage
## S3 method for class 'eam_summarise_by_tbl'
e1 + e2
Arguments
e1 |
First eam_summarise_by_tbl object |
e2 |
Second eam_summarise_by_tbl object |
Value
A joined data frame with class "eam_summarise_by_tbl", preserving the .wider_by attribute from the input tables
Bootstrap resample ABC posterior samples
Description
Bootstrap resample ABC posterior samples
Usage
abc_posterior_bootstrap(abc_result, n_samples, replace = TRUE)
Arguments
abc_result |
An abc object from |
n_samples |
Number of bootstrap samples to draw |
replace |
Logical, whether to sample with replacement (default TRUE) |
Value
Data frame of bootstrapped parameter values
Examples
# Load an example abc output; in practice, generate it by applying ABC to your own data
# See abc::abc for details on fitting ABC models
rdm_minimal_example <- system.file("extdata", "rdm_minimal", package = "eam")
abc_model <- readRDS(file.path(rdm_minimal_example, "abc", "abc_neuralnet_model.rds"))
# Bootstrap resample posterior parameters
posterior_params <- abc_posterior_bootstrap(
abc_model,
n_samples = 100
)
# View the first few rows of the bootstrapped posterior parameters
head(posterior_params)
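For readers who want to see the resampling idea in isolation, the following is a minimal base R sketch, not the package's implementation. It assumes the posterior is available as a data frame with one row per accepted draw; the helper name bootstrap_rows and the toy parameter values are hypothetical.

```r
set.seed(1)

# Hypothetical posterior draws: 50 accepted samples for two parameters
posterior <- data.frame(
  V = rnorm(50, mean = 1, sd = 0.2),
  ndt = runif(50, min = 0.2, max = 0.4)
)

# Hypothetical helper: resample whole rows so that parameter values
# drawn together stay together
bootstrap_rows <- function(samples, n_samples, replace = TRUE) {
  idx <- sample.int(nrow(samples), size = n_samples, replace = replace)
  samples[idx, , drop = FALSE]
}

boot <- bootstrap_rows(posterior, n_samples = 100)
head(boot)
```

Sampling row indices rather than individual columns keeps any correlation between parameters intact in the bootstrapped output.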
ABC model comparison wrapper
Description
Wrapper function for postpr to facilitate model comparison.
This function simplifies the process of comparing multiple models using ABC by
automatically stacking summary statistics and creating model indices.
Usage
abc_postpr(sumstats = list(), target, ...)
Arguments
sumstats |
A named list of summary statistics matrices from different models. Each element should be a matrix or data frame with the same columns. |
target |
Target summary statistics from observed data (vector or matrix) |
... |
Additional arguments passed to |
Value
An object of class "postpr" from postpr
Examples
# Load pre-computed ABC input for model comparison
# This example compares the same model to itself for demonstration
rdm_minimal_example <- system.file("extdata", "rdm_minimal", package = "eam")
abc_input <- readRDS(file.path(rdm_minimal_example, "abc", "abc_input.rds"))
# Compare two models using their summary statistics
# In practice, create different abc_input objects for different models:
# abc_input_1 <- build_abc_input(..., simulation_summary = sim_summary_1, ...)
# abc_input_2 <- build_abc_input(..., simulation_summary = sim_summary_2, ...)
postpr_result <- abc_postpr(
sumstats = list(model1 = abc_input$sumstat, model2 = abc_input$sumstat),
target = abc_input$target,
tol = 0.5,
method = "rejection"
)
# View model comparison results
summary(postpr_result)
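The stacking step mentioned in the Description can be pictured with a small base R sketch. It is an illustration under assumptions, not the wrapper's actual code: the helper name stack_sumstats is hypothetical, and it simply row-binds the per-model matrices and builds a matching vector of model labels.

```r
# Hypothetical helper: stack per-model summary statistics and
# build one model label per stacked row
stack_sumstats <- function(sumstats) {
  stopifnot(length(sumstats) > 0, !is.null(names(sumstats)))
  mats <- lapply(sumstats, as.matrix)
  list(
    index = rep(names(mats), times = vapply(mats, nrow, integer(1))),
    sumstat = do.call(rbind, mats)
  )
}

# Toy summary statistics for two hypothetical models
model1 <- matrix(c(0.8, 0.9, 1.0), ncol = 1, dimnames = list(NULL, "rt_mean"))
model2 <- matrix(c(1.2, 1.3), ncol = 1, dimnames = list(NULL, "rt_mean"))

stacked <- stack_sumstats(list(model1 = model1, model2 = model2))
table(stacked$index)
```

The resulting index and sumstat pair has one label per simulated row, which is the shape a model-comparison routine needs to tell the models apart.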
ABC with resampling
Description
Performs ABC inference with resampling to assess stability and uncertainty. Each iteration draws a random sample from the simulation pool and runs ABC, producing multiple posterior estimates for comparison.
Usage
abc_resample(
target,
param,
sumstat,
n_iterations,
n_samples,
replace = FALSE,
...
)
Arguments
target |
Target summary statistics from observed data |
param |
Parameter values matrix or data frame |
sumstat |
Summary statistics matrix or data frame |
n_iterations |
Number of resample iterations |
n_samples |
Number of samples to draw in each iteration |
replace |
Logical, whether to sample with replacement (default FALSE) |
... |
Additional arguments passed to abc::abc |
Value
A list of length n_iterations, where each element is an object
of class abc returned by abc. Each list element
contains the ABC posterior for one resampling iteration, allowing assessment
of stability and uncertainty in parameter estimates.
Examples
# Load ABC input data from example simulation
abc_input <- readRDS(
system.file("extdata", "rdm_minimal", "abc", "abc_input.rds", package = "eam")
)
# Perform ABC resampling
results <- abc_resample(
target = abc_input$target,
param = abc_input$param,
sumstat = abc_input$sumstat,
n_iterations = 2,
n_samples = 2,
tol = 0.5,
method = "rejection"
)
# check the abc results
str(results)
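The resampling loop described above can be sketched in base R as follows. This is a simplified stand-in, not the package's code: toy_rejection is a hypothetical nearest-neighbour rejection step used in place of abc::abc, and the simulated pool is invented for illustration.

```r
set.seed(42)

# Hypothetical simulation pool: one parameter, one summary statistic
param <- data.frame(V = runif(200, min = 0, max = 2))
sumstat <- data.frame(rt_mean = 1 / (param$V + 0.5) + rnorm(200, sd = 0.05))
target <- c(rt_mean = 0.8)

# Hypothetical stand-in for abc::abc: keep the fraction `tol` of
# simulations whose summary statistics are closest to the target
toy_rejection <- function(target, param, sumstat, tol) {
  d <- sqrt(rowSums(sweep(as.matrix(sumstat), 2, target)^2))
  keep <- order(d)[seq_len(ceiling(tol * length(d)))]
  param[keep, , drop = FALSE]
}

# Each iteration draws a random subset of the pool and fits it separately
resample_toy <- function(target, param, sumstat, n_iterations, n_samples,
                         replace = FALSE, tol = 0.5) {
  lapply(seq_len(n_iterations), function(i) {
    idx <- sample.int(nrow(param), size = n_samples, replace = replace)
    toy_rejection(target, param[idx, , drop = FALSE],
                  sumstat[idx, , drop = FALSE], tol)
  })
}

fits <- resample_toy(target, param, sumstat, n_iterations = 5, n_samples = 100)
medians <- vapply(fits, function(p) median(p$V), numeric(1))
```

The spread of medians across iterations gives a rough sense of how stable the estimate is with respect to the particular simulations drawn.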
Simulate evidence accumulation in a drift-diffusion model
Description
Simulate evidence accumulation in a drift-diffusion model
Usage
accumulate_evidence_ddm(
A,
V,
Z,
ndt,
max_t,
dt,
max_reached,
noise_mechanism = "add",
noise_func = NULL
)
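The usage above lists the arguments without describing the update rule. As a rough illustration only, and not the package's compiled implementation, a single accumulator with additive noise can be sketched in base R with an Euler-Maruyama step. The function name simulate_ddm_toy and all parameter values are hypothetical; the noise scale sqrt(dt) is an assumption that matches the noise factory shown in the new_simulation_config example.

```r
# Hypothetical single-bound accumulator with additive Gaussian noise
simulate_ddm_toy <- function(A, V, Z, ndt, max_t, dt) {
  x <- Z
  for (step in seq_len(floor(max_t / dt))) {
    # Drift plus noise over one time step
    x <- x + V * dt + rnorm(1, mean = 0, sd = sqrt(dt))
    # Threshold reached: response time is decision time plus non-decision time
    if (x >= A) return(ndt + step * dt)
  }
  NA_real_  # timed out before reaching the threshold
}

set.seed(123)
rts <- vapply(
  seq_len(200),
  function(i) simulate_ddm_toy(A = 1, V = 1.5, Z = 0, ndt = 0.3, max_t = 5, dt = 0.01),
  numeric(1)
)
summary(rts)
```

Runs that never reach the threshold within max_t are returned as NA here, which is one simple way to represent a timeout.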
Simulate evidence accumulation in a two-bound drift-diffusion model
Description
Simulate evidence accumulation in a two-bound drift-diffusion model
Usage
accumulate_evidence_ddm_2b(
A_upper,
A_lower,
V,
Z,
ndt,
max_t,
dt,
max_reached,
noise_mechanism = "add",
noise_func = NULL
)
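With two boundaries, each run yields a choice as well as a response time. The sketch below illustrates this under stated assumptions rather than reproducing the package's backend: the name simulate_ddm_2b_toy is hypothetical, and coding the upper boundary as 1 and the lower boundary as -1 is an arbitrary choice made for this example.

```r
# Hypothetical two-bound accumulator: stops at whichever boundary is hit first
simulate_ddm_2b_toy <- function(A_upper, A_lower, V, Z, ndt, max_t, dt) {
  x <- Z
  for (step in seq_len(floor(max_t / dt))) {
    x <- x + V * dt + rnorm(1, mean = 0, sd = sqrt(dt))
    if (x >= A_upper) return(c(rt = ndt + step * dt, choice = 1))
    if (x <= A_lower) return(c(rt = ndt + step * dt, choice = -1))
  }
  c(rt = NA_real_, choice = NA_real_)  # neither boundary reached in time
}

set.seed(99)
trials <- t(replicate(
  100,
  simulate_ddm_2b_toy(A_upper = 1, A_lower = -1, V = 0.8, Z = 0,
                      ndt = 0.3, max_t = 5, dt = 0.01)
))
table(trials[, "choice"], useNA = "ifany")
```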
Simulate evidence accumulation in a leaky competing accumulator model with Global Inhibition (LCA-GI)
Description
Simulate evidence accumulation in a leaky competing accumulator model with Global Inhibition (LCA-GI)
Usage
accumulate_evidence_lca_gi(
A,
V,
Z,
ndt,
beta,
k,
max_t,
dt,
max_reached,
noise_func = NULL
)
Internal function to apply a spec to data
Description
Internal function to apply a spec to data
Usage
apply_summarise_by_spec(spec_list, .data)
Arguments
spec_list |
A list of spec operations (the internal spec list) |
.data |
A data frame |
Value
A data frame with class "eam_summarise_by_tbl"
Build input for Approximate Bayesian Computation (ABC)
Description
Prepares simulation output, summary statistics, and target data for ABC
analysis using the abc package. Extracts parameters and summary
statistics from simulation results and formats them into matrices suitable
for ABC parameter estimation.
Usage
build_abc_input(simulation_output, simulation_summary, target_summary, param)
Arguments
simulation_output |
An eam_simulation_output object created by run_simulation or load_simulation_output |
simulation_summary |
A data frame containing summary statistics for each
simulated condition. Should have a 'condition_idx' column and be created by
summarise_by |
target_summary |
A data frame containing target summary statistics to match against simulation results. Should have the same summary statistic columns as simulation_summary (excluding 'wider_by' columns). |
param |
Character vector of parameter names to extract from simulation_output. These parameters will be used as the parameter space for ABC estimation. |
Details
This function provides a streamlined workflow for preparing ABC inputs, but
it requires that all components be constructed using this package's functions.
Specifically, simulation_output must be created by run_simulation
or load_simulation_output, and both simulation_summary and
target_summary must be generated using summarise_by. If
your data originates from external sources or custom pipelines, you should
manually construct the ABC input list instead, ensuring proper matrix formatting
and column alignment as expected by abc::abc.
Value
A list with components suitable for abc::abc
Required format for summary statistics
Both simulation_summary and target_summary must be created using
summarise_by.
This ensures consistent column naming and data structure required for ABC analysis.
See summarise_by for details on generating properly formatted summaries,
and map_by_condition for typical workflow examples.
If you want more flexibility in summary statistic calculation, you can manually
construct the ABC input list. It is not necessary to use this function if you
are familiar with the abc package.
Examples
# Load the example dataset
rdm_minimal_example <- system.file("extdata", "rdm_minimal", package = "eam")
sim_output <- load_simulation_output(file.path(rdm_minimal_example, "simulation"))
obs_df <- read.csv(file.path(rdm_minimal_example, "observation", "observation_data.csv"))
# Define summary statistics pipeline
summary_pipe <- summarise_by(
.by = c("condition_idx"),
rt_mean = mean(rt)
)
# Calculate summary statistics for simulation and observation
sim_summary <- map_by_condition(
sim_output,
.progress = FALSE,
.parallel = FALSE,
function(cond_df) {
summary_pipe(cond_df)
}
)
obs_summary <- summary_pipe(obs_df)
# Build ABC input
abc_input <- build_abc_input(
simulation_output = sim_output,
simulation_summary = sim_summary,
target_summary = obs_summary,
param = c("V_beta_1", "V_beta_group")
)
# Perform ABC parameter estimation using rejection method
abc_rejection_model <- abc::abc(
target = abc_input$target,
param = abc_input$param,
sumstat = abc_input$sumstat,
tol = 0.5,
method = "rejection"
)
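The Details above suggest constructing the ABC input list by hand when data come from other pipelines. A minimal base R sketch of that route follows, assuming the list needs the three components used in the abc::abc call above (target, param, sumstat). The toy data frames and their values are invented for illustration.

```r
# Hypothetical simulated parameters and summaries, stored in different row orders
sim_params <- data.frame(
  condition_idx = 1:4,
  V_beta_1 = c(0.2, 0.5, 0.8, 1.1)
)
sim_summary <- data.frame(
  condition_idx = c(3, 1, 4, 2),
  rt_mean = c(0.7, 1.2, 0.6, 0.9)
)
obs_summary <- data.frame(rt_mean = 0.85)

# Align rows by condition so parameters and statistics describe the same simulation
stat_cols <- setdiff(names(sim_summary), "condition_idx")
aligned <- merge(sim_params, sim_summary, by = "condition_idx")

manual_abc_input <- list(
  target = unlist(obs_summary[1, stat_cols, drop = FALSE]),
  param = as.matrix(aligned[, "V_beta_1", drop = FALSE]),
  sumstat = as.matrix(aligned[, stat_cols, drop = FALSE])
)
str(manual_abc_input)
```

The two points to check in any hand-built input are that param and sumstat have the same number of rows in the same order, and that the names of target match the columns of sumstat.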
Calculate total number of rows needed for flattened data
Description
This function counts the total number of items across all conditions and trials to determine the size needed for pre-allocation.
Usage
calculate_total_rows(sim_results, first_trial_col)
Arguments
sim_results |
The output from run_simulation(), a list of conditions |
first_trial_col |
Name of the first trial column to use for counting |
Value
Integer, total number of rows needed
Backend detector for standard DDM
Description
Backend detector for standard DDM
Usage
detect_backend_ddm(model_lower, config)
Arguments
model_lower |
Lowercase model name |
config |
A list containing simulation configuration parameters |
Value
Backend name if this detector handles the config, NULL otherwise
Backend detector for 2-boundary DDM
Description
Backend detector for 2-boundary DDM
Usage
detect_backend_ddm_2b(model_lower, config)
Arguments
model_lower |
Lowercase model name |
config |
A list containing simulation configuration parameters |
Value
Backend name if this detector handles the config, NULL otherwise
Backend detector for LCA-GI
Description
Backend detector for LCA-GI
Usage
detect_backend_lca_gi(model_lower, config)
Arguments
model_lower |
Lowercase model name |
config |
A list containing simulation configuration parameters |
Value
Backend name if this detector handles the config, NULL otherwise
Evaluate a list of formulas sequentially with data
Description
This function evaluates a list of formulas sequentially, allowing later formulas to reference the results of earlier ones.
Usage
evaluate_with_dt(formulas, data = list(), n)
Arguments
formulas |
A list of formulas to evaluate |
data |
A list of named values to use as the initial environment |
n |
The number of values to generate for each formula |
Value
A named list of evaluated values with length n
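Sequential evaluation can be illustrated with a short base R sketch. It is a simplification under assumptions and not the internal function: the helper name evaluate_sequentially is hypothetical, it ignores the n argument, and it handles only plain expressions on the right-hand side rather than distribution objects.

```r
# Hypothetical helper: evaluate each right-hand side in an environment
# that already holds the results of earlier formulas
evaluate_sequentially <- function(formulas, data = list()) {
  env <- new.env(parent = baseenv())
  for (nm in names(data)) assign(nm, data[[nm]], envir = env)
  out <- list()
  for (f in formulas) {
    name <- as.character(f[[2]])        # left-hand side variable name
    value <- eval(f[[3]], envir = env)  # right-hand side expression
    assign(name, value, envir = env)    # make it visible to later formulas
    out[[name]] <- value
  }
  out
}

res <- evaluate_sequentially(
  list(
    sigma ~ 0.5,
    V_condition ~ base + 1,
    V ~ V_condition + 2 * sigma
  ),
  data = list(base = 0)
)
str(res)
```

Because each result is written back into the evaluation environment, the order of the formulas matters: a formula can only use names defined before it.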
Extract parameter values from abc result
Description
Extract parameter values from abc result
Usage
extract_abc_param_values(abc_result)
Arguments
abc_result |
Single abc result object |
Value
Matrix of parameter values
Extract posterior medians from abc_resample output
Description
Internal helper to compute parameter medians across abc_resample iterations.
Usage
extract_resample_medians(resample_results)
Arguments
resample_results |
List of abc results from abc_resample |
Value
Matrix where each row is an iteration and each column is a parameter's median
Fill pre-allocated data.table with simulation results
Description
This function fills the pre-allocated data.table vectors with data from simulation results, iterating through all conditions and trials.
Usage
fill_data_table(
sim_results,
dt_lists,
trial_col_names,
cond_param_names,
first_trial_col
)
Arguments
sim_results |
The output from run_simulation(), a list of conditions |
dt_lists |
Named list of pre-allocated vectors for each column |
trial_col_names |
Character vector of trial column names |
cond_param_names |
Character vector of condition parameter names |
first_trial_col |
Name of the first trial column to use for item counting |
Value
Named list of filled vectors ready for data.table creation
Convert simulation results to a tidy data.table
Description
This function takes the nested list output from run_simulation() and converts it into a tidy data.table where each row represents one item response. The function pre-allocates the data.table to the exact size needed and then fills it efficiently. Column names are dynamically determined from the first trial result, excluding any .item_params verbose output.
Usage
flatten_simulation_results(sim_results)
Arguments
sim_results |
The output from run_simulation(), a list of conditions |
Value
A data.table with columns: condition_idx, trial_idx, rank_idx, all columns from trial results (excluding .item_params), and all variables from cond_params
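The count, pre-allocate, and fill pattern described here can be sketched in base R. The nested structure below is a hypothetical stand-in for the real run_simulation() output, whose exact layout is not shown in this manual, and rank_idx is treated simply as the position of an item within its trial.

```r
# Hypothetical nested results: conditions -> trials -> item-level rt vectors
sim_results <- list(
  list(trials = list(list(rt = c(0.5, 0.7)), list(rt = 0.6))),
  list(trials = list(list(rt = c(0.9, 1.1, 1.3))))
)

# Step 1: count rows so every column can be allocated once
total_rows <- sum(vapply(
  sim_results,
  function(cond) sum(lengths(lapply(cond$trials, `[[`, "rt"))),
  integer(1)
))

# Step 2: pre-allocate one vector per output column
condition_idx <- integer(total_rows)
trial_idx <- integer(total_rows)
rank_idx <- integer(total_rows)
rt <- numeric(total_rows)

# Step 3: fill the vectors in place, tracking the current row offset
offset <- 0L
for (c_i in seq_along(sim_results)) {
  trials <- sim_results[[c_i]]$trials
  for (t_i in seq_along(trials)) {
    n <- length(trials[[t_i]]$rt)
    rows <- offset + seq_len(n)
    condition_idx[rows] <- c_i
    trial_idx[rows] <- t_i
    rank_idx[rows] <- seq_len(n)
    rt[rows] <- trials[[t_i]]$rt
    offset <- offset + n
  }
}

flat <- data.frame(condition_idx, trial_idx, rank_idx, rt)
flat
```

Allocating each column once avoids the repeated copying that comes from growing a table row by row.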
Get all registered backend detectors
Description
Get all registered backend detectors
Usage
get_backend_detectors()
Value
A list of backend detector functions
Extract column names from simulation results
Description
This function extracts trial column names and condition parameter names from simulation results structure.
Usage
get_column_names(sim_results)
Arguments
sim_results |
The output from run_simulation(), a list of conditions |
Value
A list with trial_col_names and cond_param_names
Extract all left-hand side variable names from config formulas and prior_params
Description
Extract all left-hand side variable names from config formulas and prior_params
Usage
get_config_env_names(config)
Arguments
config |
A list containing simulation configuration parameters |
Value
A character vector of all LHS variable names from formulas and prior_params columns
Initialize simulation output directory structure
Description
Creates and validates the output directory structure for a simulation. This function ensures the directory is empty (or creates it), then creates the required subdirectories based on simulation_output_fs_proto.
Usage
init_simulation_output_dir(output_dir)
Arguments
output_dir |
The base output directory path |
Value
The output_dir path (invisibly for chaining)
Rebuild eam_simulation_output from an existing output directory
Description
This function reconstructs a eam_simulation_output object from a previously saved simulation output directory.
Usage
load_simulation_output(output_dir)
Arguments
output_dir |
The directory containing the simulation results and config |
Value
A eam_simulation_output object
Examples
# Load simulation output from package data
sim_output_path <- system.file(
"extdata", "rdm_minimal", "simulation",
package = "eam"
)
sim_output <- load_simulation_output(sim_output_path)
# Access the configuration
sim_output$simulation_config
# Access the dataset (check arrow documentation for working with the dataset)
dataset <- sim_output$open_dataset()
Map a function by condition across simulation output chunks
Description
This function processes simulation output by gathering all chunks, iterating through them one by one, filtering and collecting data by chunk, then applying a user-defined function by condition within each chunk.
Usage
map_by_condition(
simulation_output,
.f,
...,
.combine = dplyr::bind_rows,
.parallel = NULL,
.n_cores = NULL,
.progress = FALSE
)
Arguments
simulation_output |
A eam_simulation_output object containing the dataset and configuration |
.f |
A function to apply to each condition's data. The function should accept a data frame representing one condition's results |
... |
Additional arguments passed to the function .f |
.combine |
Function to combine results (default: dplyr::bind_rows) |
.parallel |
Logical or NULL. |
.n_cores |
Integer. Number of CPU cores to use for parallel processing.
If NULL, uses |
.progress |
Logical, whether to show a progress bar (default: FALSE) |
Details
This function handles out-of-core computation automatically using Apache Arrow, so you don't need to understand Arrow internals. It loads data chunk by chunk to avoid memory issues with large simulations.
If you prefer to manually work with the raw Arrow dataset, you can access it
via simulation_output$open_dataset(), which returns an Arrow Dataset
object. You can then use dplyr verbs to filter and query before calling
dplyr::collect() to load data into memory.
Value
A list containing the results of applying .f to each condition, with names corresponding to condition indices
Examples
# Load simulation output
sim_output_path <- system.file(
"extdata", "rdm_minimal", "simulation",
package = "eam"
)
sim_output <- load_simulation_output(sim_output_path)
# Define a summary pipeline
summary_pipe <- summarise_by(
.by = c("condition_idx"),
rt_mean = mean(rt),
rt_quantiles = quantile(rt, probs = c(0.1, 0.5, 0.9))
)
# Apply function to each condition
sim_sumstat <- map_by_condition(
sim_output,
.progress = FALSE,
.parallel = FALSE,
function(cond_df) {
summary_pipe(cond_df)
}
)
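The chunk-then-condition iteration described above can be pictured with an in-memory base R sketch. It is illustrative only: the real function reads chunks from an Arrow dataset, whereas this toy version splits an ordinary data frame, and the name map_by_condition_toy and the chunk_idx column are assumptions made for the example.

```r
# Toy data: two chunks, each holding two conditions with three rows
toy <- data.frame(
  chunk_idx = rep(1:2, each = 6),
  condition_idx = rep(1:4, each = 3),
  rt = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
)

# Hypothetical helper: process one chunk at a time, apply f per condition,
# then combine all per-condition results
map_by_condition_toy <- function(df, f,
                                 combine = function(parts) do.call(rbind, parts)) {
  per_chunk <- lapply(split(df, df$chunk_idx), function(chunk) {
    lapply(split(chunk, chunk$condition_idx), f)
  })
  combine(unlist(per_chunk, recursive = FALSE))
}

res <- map_by_condition_toy(toy, function(cond_df) {
  data.frame(
    condition_idx = cond_df$condition_idx[1],
    rt_mean = mean(cond_df$rt)
  )
})
res
```

Only one chunk's rows are handled inside each inner call, which is the property that keeps memory use bounded when the chunks come from disk.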
Heuristic to determine if parallel processing should be used
Description
Heuristic to determine if parallel processing should be used
Usage
map_by_condition.parallel.heuristic(chunk_indices)
Arguments
chunk_indices |
Vector of chunk indices |
Value
Logical value indicating whether to use parallel processing
Process a single chunk for map_by_condition
Description
Process a single chunk for map_by_condition
Usage
map_by_condition.process_chunk(open_dataset_fn, .f, ...)
Arguments
open_dataset_fn |
Arrow dataset object or function that returns a dataset |
.f |
Function to apply to each condition's data |
... |
Additional arguments passed to .f |
Value
Function that processes a chunk_idx
Create a new simulation configuration
Description
This function creates a new eam simulation configuration object that contains all parameters needed to run a simulation.
Usage
new_simulation_config(
prior_params = list(),
prior_formulas = list(),
between_trial_formulas = list(),
item_formulas = list(),
n_conditions_per_chunk = NULL,
n_conditions,
n_trials_per_condition,
n_items,
max_reached = n_items,
max_t,
dt = 0.001,
noise_mechanism = "add",
noise_factory = NULL,
model = "ddm",
parallel = FALSE,
n_cores = NULL,
rand_seed = NULL
)
Arguments
prior_params |
A list or data frame of initial constant values, available to all formulas |
prior_formulas |
A list of formulas defining prior distributions for condition-level parameters |
between_trial_formulas |
A list of formulas defining between-trial parameters |
item_formulas |
A list of formulas defining item-level parameters |
n_conditions_per_chunk |
Number of conditions to process per chunk (optional; typically does not need to be set). It determines the storage and in-memory size of each chunk; if you run into memory issues, try reducing this number. |
n_conditions |
Total number of conditions to simulate |
n_trials_per_condition |
Number of trials per condition |
n_items |
Number of items per trial |
max_reached |
Maximum number of items that can be recalled (default: n_items) |
max_t |
Maximum simulation time |
dt |
Time step size (default: 0.001) |
noise_mechanism |
Noise mechanism ("add", "mult_evidence", or "mult_t", default: "add") |
noise_factory |
Function that creates noise functions. |
model |
Model name or backend names (e.g., "ddm", "rdm", "lca") |
parallel |
Whether to run in parallel (default: FALSE) |
n_cores |
Number of cores for parallel processing (default: NULL, auto-detect) |
rand_seed |
Random seed for parallel processing (default: NULL) |
Details
This function only creates the configuration object and does not run the
simulation. To actually execute the simulation, you must pass the returned
configuration object to run_simulation.
Supported Models:
This package supports the five evidence accumulation models listed below. The appropriate
backend is automatically selected based on the model parameter and
the parameters defined in your formulas.
- DDM (Drift Diffusion Model)
  Models evidence accumulation towards a single upper threshold. Items either reach the threshold and are recalled, or time out.
  Required parameters (must appear in prior_formulas, between_trial_formulas, or item_formulas):
  - A - Upper decision boundary/threshold
  - V - Drift rate (evidence accumulation rate)
  - Z - Starting point of evidence
  - ndt - Non-decision time
  Set model = "ddm".
- RDM (Racing Diffusion Model)
  Models multiple racing evidence accumulators, each with upper and lower thresholds for binary decisions (correct/incorrect).
  Required parameters:
  - A_upper - Upper decision boundary (correct response)
  - A_lower - Lower decision boundary (incorrect response)
  - V - Drift rate
  - Z - Starting point of evidence
  - ndt - Non-decision time
  Set model = "rdm". Note: If you set model = "ddm" but define A_upper instead of A, the model will automatically switch to RDM.
- LCA (Leaky Competing Accumulator)
  Models competitive evidence accumulation with leakage and mutual inhibition between accumulators.
  Required parameters:
  - A - Decision threshold
  - V - Input strength/drift rate
  - Z - Starting point of evidence
  - ndt - Non-decision time
  - beta - Self-excitation/leak parameter
  - k - Lateral inhibition strength
  Set model = "lca".
- LFM (Lévy Flight Model)
  Uses the same parameters as DDM. See DDM above.
  Set model = "lfm".
- LBA (Linear Ballistic Accumulator)
  Uses the same parameters as RDM. See RDM above.
  Set model = "lba".
Note: All required parameters must be defined at least once across
prior_params, prior_formulas, between_trial_formulas, and
item_formulas.
Parameter Hierarchy and Formula Evaluation:
The simulation uses a hierarchical parameter system with sequential formula evaluation, allowing later formulas to reference earlier ones:
- prior_params - Initial constant values available to all formulas.
- prior_formulas - Evaluated once per condition. Can reference prior_params. Use for condition-level parameters that vary across conditions.
- between_trial_formulas - Evaluated once per trial within each condition. Can reference both prior_params and variables from prior_formulas. Use for trial-level variability.
- item_formulas - Evaluated once per item within each trial. Can reference all previous parameters. Use for item-specific parameters.
Using Distributions:
You can use the distributional package to define random parameters.
For example:
- A ~ distributional::dist_uniform(0.5, 2.0) - Uniform distribution
- V_condition ~ distributional::dist_normal(1.0, 0.2) - Normal distribution
- sigma ~ 0.5 - Constant value
- V ~ distributional::dist_normal(V_condition, sigma) - Reference earlier parameters
Each formula is evaluated sequentially, so you can build complex parameter
dependencies. For instance, you might define a base drift rate V in
prior_formulas, then add trial-level noise in
between_trial_formulas, and finally scale by item position in
item_formulas.
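The three-level dependency described in the paragraph above (a base drift rate per condition, trial-level noise, then scaling by item position) can be written out in plain base R to make the data flow concrete. This is only a numerical sketch of the idea; it does not use the package's formula interface, and the variable names and the 1 / position scaling are hypothetical.

```r
set.seed(7)
n_trials <- 3
n_items <- 4

# Condition level: one base drift rate for the whole condition
V_condition <- runif(1, min = 0.5, max = 1.5)

# Trial level: each trial perturbs the condition's drift rate
V_trial <- rnorm(n_trials, mean = V_condition, sd = 0.1)

# Item level: scale each trial's drift by item position (rows = trials, cols = items)
item_position <- seq_len(n_items)
V_item <- outer(V_trial, 1 / item_position)

round(V_item, 3)
```

Each level consumes only values produced at the levels above it, which mirrors the order in which the formula lists are evaluated.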
Value
An S3 object of class eam_simulation_config containing validated
simulation parameters. This object should be passed to
run_simulation to execute the simulation.
Examples
# Define formulas for the simulation
prior_formulas <- list(
V ~ distributional::dist_uniform(0.1, 1.0),
ndt ~ 0.3,
noise_coef ~ 1
)
between_trial_formulas <- list()
item_formulas <- list(
A_upper ~ 1,
A_lower ~ -1,
V ~ V
)
# Define noise factory
noise_factory <- function(context) {
noise_coef <- context$noise_coef
function(n, dt) {
noise_coef * rnorm(n, mean = 0, sd = sqrt(dt))
}
}
# Create configuration
config <- new_simulation_config(
prior_formulas = prior_formulas,
between_trial_formulas = between_trial_formulas,
item_formulas = item_formulas,
n_conditions = 10,
n_trials_per_condition = 10,
n_items = 5,
max_reached = 5,
max_t = 10,
dt = 0.01,
noise_mechanism = "add",
noise_factory = noise_factory,
model = "ddm",
parallel = FALSE
)
# print the config
config
# Run simulation
sim_output <- run_simulation(config)
sim_output
Heuristic to calculate optimal chunk size for simulation configuration
Description
Heuristic to calculate optimal chunk size for simulation configuration
Usage
new_simulation_config.chunk_size.heuristic(
n_conditions,
n_trials_per_condition,
n_items,
parallel,
n_cores
)
Arguments
n_conditions |
Total number of conditions to simulate |
n_trials_per_condition |
Number of trials per condition |
n_items |
Number of items per trial |
parallel |
Whether to run in parallel |
n_cores |
Number of cores for parallel processing |
Value
Optimal number of conditions per chunk
Create a eam_simulation_output object
Description
Create a eam_simulation_output object
Usage
new_simulation_output(simulation_config, output_dir)
Plot accuracy comparison between posterior and observed data
Description
Visualizes accuracy metrics comparing posterior simulation results with observed data. Creates side-by-side bar plots for easy comparison across conditions.
Usage
plot_accuracy(
simulated_output,
observed_df,
x = "item_idx",
facet_x = c(),
facet_y = c()
)
Arguments
simulated_output |
Posterior simulation output from run_simulation() |
observed_df |
Observed data frame |
x |
Variable for x-axis (default: "item_idx") |
facet_x |
Variables for faceting columns |
facet_y |
Variables for faceting rows |
Value
A ggplot2 object
Examples
# Load posterior simulation output and observed data
base_dir <- system.file("extdata", "rdm_minimal", package = "eam")
post_output <- load_simulation_output(file.path(base_dir, "abc", "posterior", "neuralnet"))
obs_df <- read.csv(file.path(base_dir, "observation", "observation_data.csv"))
# Plot accuracy comparison between posterior and observed data
# The plot shows side-by-side bars comparing hit rates or accuracy
plot_accuracy(
post_output,
obs_df,
facet_x = c("group")
)
Plot accuracy for DDM model (internal)
Description
Calculates hit rate (proportion of trials with choice == 1) across all possible trial combinations. For simulated data, expands grid based on simulation config parameters and left joins with actual simulation results. For observed data, assumes data is already in the correct format.
Usage
plot_accuracy_ddm(
simulated_output,
observed_df,
x = "item_idx",
facet_x = c(),
facet_y = c()
)
Arguments
simulated_output |
Simulation output object |
observed_df |
Observed data frame (already expanded with all trial combinations) |
x |
Variable for x-axis |
facet_x |
Variables for faceting columns |
facet_y |
Variables for faceting rows |
Value
A ggplot2 object
Plot accuracy for DDM-2B model (internal)
Description
Plot accuracy for DDM-2B model (internal)
Usage
plot_accuracy_ddm_2b(
simulated_output,
observed_df,
x = "item_idx",
facet_x = c(),
facet_y = c()
)
Arguments
simulated_output |
Simulation output object |
observed_df |
Observed data frame |
x |
Variable for x-axis |
facet_x |
Variables for faceting columns |
facet_y |
Variables for faceting rows |
Value
A ggplot2 object
Plot accuracy graph (internal)
Description
Plot accuracy graph (internal)
Usage
plot_accuracy_graph(
accuracy_df,
x = "item_idx",
y = "accuracy",
facet_x = c(),
facet_y = c()
)
Arguments
accuracy_df |
Data frame with accuracy values |
x |
Variable for x-axis |
y |
Variable for y-axis (default: "accuracy") |
facet_x |
Variables for faceting columns |
facet_y |
Variables for faceting rows |
Value
A ggplot2 object
Plot CV parameter pair correlations
Description
Create a matrix of pairwise plots for cross-validation parameter estimates, including scatter plots with fitted trends, rank correlations, and marginal distributions.
Usage
plot_cv_pair_correlation(data, ...)
## S3 method for class 'cv4abc'
plot_cv_pair_correlation(data, ...)
Arguments
data |
A |
... |
Additional arguments:
|
Value
Invisibly returns 'NULL'. Called for its side effect of producing plots.
See Also
plot_cv_pair_correlation.cv4abc
Examples
# Load CV output from saved file
cv_file <- system.file(
"extdata", "rdm_minimal", "abc", "cv", "neuralnet.rds",
package = "eam"
)
abc_neuralnet_cv <- readRDS(cv_file)
# Plot parameter pair correlations
plot_cv_pair_correlation(abc_neuralnet_cv)
Plot CV parameter recovery
Description
Visualize parameter recovery from cross-validation results, showing estimated vs. true parameter values and residual distributions for each parameter.
Usage
plot_cv_recovery(data, ...)
## S3 method for class 'cv4abc'
plot_cv_recovery(data, ...)
Arguments
data |
A |
... |
Additional arguments:
|
Value
Invisibly returns 'NULL'. Called for its side effect of producing plots.
Examples
# Load CV output from saved file
cv_file <- system.file(
"extdata", "rdm_minimal", "abc", "cv", "neuralnet.rds",
package = "eam"
)
abc_neuralnet_cv <- readRDS(cv_file)
# Plot parameter recovery
plot_cv_recovery(
abc_neuralnet_cv,
n_rows = 2,
n_cols = 1,
resid_tol = 0.99
)
Plot parameter posterior distributions
Description
Plots posterior distributions (and optionally prior distributions) from ABC results.
Usage
plot_posterior_parameters(data, ...)
## S3 method for class 'abc'
plot_posterior_parameters(data, abc_input = NULL, ...)
Arguments
data |
An |
... |
Additional arguments:
|
abc_input |
Optional abc_input object containing prior samples for comparison. |
Value
Invisibly returns 'NULL'. Called for its side effect of producing plots.
Examples
# Load ABC output from saved file
abc_file <- system.file(
"extdata", "rdm_minimal", "abc", "abc_rejection_model.rds",
package = "eam"
)
abc_rejection_model <- readRDS(abc_file)
# Load ABC input for prior comparison
abc_input_file <- system.file(
"extdata", "rdm_minimal", "abc", "abc_input.rds",
package = "eam"
)
abc_input <- readRDS(abc_input_file)
# Plot posterior distributions with prior comparison
plot_posterior_parameters(abc_rejection_model, abc_input)
Plot resample forest plots
Description
Create forest plots showing parameter ranges across resample iterations. Each iteration is displayed as a horizontal line with quantile intervals.
Usage
plot_resample_forest(
data,
n_rows = 2,
n_cols = 2,
interactive = FALSE,
ci_level = 0.95
)
Arguments
data |
List of abc results from abc_resample |
n_rows |
Number of rows in plot grid (default 2) |
n_cols |
Number of columns in plot grid (default 2) |
interactive |
Whether to pause between pages (default FALSE) |
ci_level |
Level of the quantile intervals (default 0.95, i.e., a 95% interval) |
Value
No return value, called for side effects (plotting). Creates forest plots displayed in the graphics device.
Examples
# Load ABC input data from example simulation
abc_input <- readRDS(
system.file("extdata", "rdm_minimal", "abc", "abc_input.rds", package = "eam")
)
# Perform ABC resampling
results <- abc_resample(
target = abc_input$target,
param = abc_input$param,
sumstat = abc_input$sumstat,
n_iterations = 100,
n_samples = 100,
tol = 0.5,
method = "rejection"
)
# plot forest plots showing parameter ranges
plot_resample_forest(results, ci_level = 0.95)
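To make the idea of per-iteration quantile intervals concrete, here is a minimal base R sketch. It does not call the package: `samples` is a hypothetical list of posterior draws, one element per resample iteration, and the plotting code is only a rough analogue of a forest plot.

```r
set.seed(42)
# Hypothetical posterior samples: 5 resample iterations x 100 draws each
samples <- lapply(1:5, function(i) rnorm(100, mean = 0.5, sd = 0.1))

ci_level <- 0.95
probs <- c((1 - ci_level) / 2, 0.5, 1 - (1 - ci_level) / 2)

# One row per iteration: lower bound, median, upper bound
intervals <- t(vapply(
  samples,
  function(s) quantile(s, probs = probs, names = FALSE),
  numeric(3)
))
colnames(intervals) <- c("lower", "median", "upper")

# Draw each iteration as a horizontal segment with its median marked
rows <- seq_len(nrow(intervals))
plot(intervals[, "median"], rows,
     xlim = range(intervals), pch = 19,
     xlab = "Parameter value", ylab = "Iteration")
segments(intervals[, "lower"], rows, intervals[, "upper"], rows)
```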
Plot resample median distributions
Description
Plot density distributions of parameter medians across resample iterations.
Usage
plot_resample_medians(data, n_rows = 2, n_cols = 2, interactive = FALSE)
Arguments
data |
List of abc results from abc_resample |
n_rows |
Number of rows in plot grid (default 2) |
n_cols |
Number of columns in plot grid (default 2) |
interactive |
Whether to pause between pages (default FALSE) |
Value
No return value, called for side effects (plotting). Creates density plots displayed in the graphics device.
Examples
# Load ABC input data from example simulation
abc_input <- readRDS(
system.file("extdata", "rdm_minimal", "abc", "abc_input.rds", package = "eam")
)
# Perform ABC resampling
results <- abc_resample(
target = abc_input$target,
param = abc_input$param,
sumstat = abc_input$sumstat,
n_iterations = 100,
n_samples = 100,
tol = 0.5,
method = "rejection"
)
# plot the resample medians for each parameter
plot_resample_medians(results)
Plot reaction time distributions
Description
Visualize reaction time distributions from your model predictions. Overlay observed experimental data for reference.
Usage
plot_rt(simulated_output, observed_df, facet_x = c("item_idx"), facet_y = c())
Arguments
simulated_output |
Output from |
observed_df |
Your observed data as a data frame |
facet_x |
Variables to split plots horizontally. Default is c("item_idx") |
facet_y |
Variables to split plots vertically. Default is none (c()) |
Value
A plot showing predicted RT distributions (blue), with observed data (red) if provided
Examples
# Load example posterior simulation output
post_output_path <- system.file(
"extdata", "rdm_minimal", "abc", "posterior", "neuralnet",
package = "eam"
)
post_output <- load_simulation_output(post_output_path)
# Load example observed data
obs_file <- system.file(
"extdata", "rdm_minimal", "observation", "observation_data.csv",
package = "eam"
)
obs_df <- read.csv(obs_file)
# Plot RT distributions by item
plot_rt(post_output, obs_df, facet_x = c("item_idx"))
# Plot RT distributions by item and group
plot_rt(
post_output,
obs_df,
facet_x = c("item_idx"),
facet_y = c("group")
)
Pre-allocate data.table columns with appropriate data types
Description
This function creates pre-allocated vectors for all columns in the final data.table, determining data types from the first trial and condition.
Usage
preallocate_columns(sim_results, trial_col_names, cond_param_names, total_rows)
Arguments
sim_results |
The output from run_simulation(), a list of conditions |
trial_col_names |
Character vector of trial column names |
cond_param_names |
Character vector of condition parameter names |
total_rows |
Integer, total number of rows to pre-allocate |
Value
Named list of pre-allocated vectors for each column
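The pre-allocation idea can be sketched in base R as follows. This is an illustration under assumptions, not the internal code: `first_row` is a hypothetical prototype row standing in for the first trial of the first condition.

```r
# Hypothetical prototype row taken from the first trial of the first condition
first_row <- list(rt = 0.52, item_idx = 1L, choice = "A", correct = TRUE)
total_rows <- 1000L

# Allocate one vector per column, matching the prototype's storage type
columns <- lapply(first_row, function(value) {
  vector(mode = typeof(value), length = total_rows)
})

# Fill by index instead of growing the vectors row by row
columns$rt[1] <- first_row$rt
columns$item_idx[1] <- first_row$item_idx

str(columns)
```

Allocating the full length once avoids the repeated copying that occurs when vectors are grown inside a loop.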
Print method for eam simulation configuration
Description
Print method for eam simulation configuration
Usage
## S3 method for class 'eam_simulation_config'
print(x, ...)
Arguments
x |
An eam_simulation_config object |
... |
Additional arguments (ignored) |
Value
Invisibly returns the input object
Helper to resolve defined symbols in our formulas
Description
This function evaluates an expression in a given environment.
Usage
resolve_symbol(expr, env, n)
Arguments
expr |
An expression to evaluate |
env |
An environment to evaluate the expression in |
n |
The number of values to generate if the expression is a distribution |
Value
The evaluated value, returned as is; no assumption is made about its type
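Evaluating a formula's right-hand side in an environment can be sketched with base R alone. The snippet below is an assumption-laden illustration rather than the helper itself: the symbol names and values are hypothetical, and the handling of distribution objects (the `n` argument) is not shown.

```r
# Environment holding previously defined symbols (hypothetical values)
env <- new.env()
assign("V", 0.8, envir = env)
assign("noise_coef", 2, envir = env)

# A formula's right-hand side is an unevaluated expression
f <- drift ~ V * noise_coef
rhs <- f[[3]]

# Evaluate the expression with symbols looked up in env
value <- eval(rhs, envir = env)
value
```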
Route model alias to backend and enrich configuration
Description
This function uses a registry of backend detectors to determine which backend implementation should handle the given configuration. Each detector examines the config and returns a backend name if it can handle it, or NULL otherwise. This design pattern (Chain of Responsibility) makes it easy to add new backends without modifying this routing function.
Usage
route_model_to_backend(config)
Arguments
config |
A list containing simulation configuration parameters |
Value
The modified config list with added 'backend' parameter
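A detector registry of this kind can be sketched in a few lines of R. The detector functions, the `route()` helper, and the matching on `config$model` below are hypothetical and only illustrate the pattern; the package's actual detectors may inspect other fields.

```r
# Hypothetical detectors: each returns a backend name or NULL
detect_ddm_2b <- function(config) {
  if (identical(config$model, "ddm-2b")) "ddm-2b" else NULL
}
detect_ddm <- function(config) {
  if (identical(config$model, "ddm")) "ddm" else NULL
}
detectors <- list(detect_ddm_2b, detect_ddm)

# Try detectors in order; the first non-NULL answer wins
route <- function(config, detectors) {
  for (detect in detectors) {
    backend <- detect(config)
    if (!is.null(backend)) {
      config$backend <- backend
      return(config)
    }
  }
  stop("No backend can handle this configuration")
}

routed <- route(list(model = "ddm", dt = 0.01), detectors)
routed$backend
```

Supporting a new backend then amounts to appending one more detector to the list, leaving `route()` untouched.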
Run a chunk of simulation conditions and save results to disk
Description
This function processes a chunk of simulation conditions, applies the flatten_simulation_results transformation, and saves the results to disk using Arrow's write_dataset with partitioning by chunk_idx.
Usage
run_chunk(config, output_dir, chunk_idx)
Arguments
config |
An eam_simulation_config object containing all simulation parameters |
output_dir |
The base output directory |
chunk_idx |
The chunk index for partitioning (1-based) |
Value
Invisible NULL (results are saved to disk)
Run a given condition with multiple trials
Description
This function runs multiple trials for a given condition using the specified
Usage
run_condition(
condition_setting,
between_trial_formulas,
item_formulas,
n_trials,
n_items,
max_reached,
max_t,
dt,
noise_mechanism,
noise_factory,
backend,
trajectories = FALSE
)
Arguments
condition_setting |
A list of named values representing the condition settings |
between_trial_formulas |
A list of formulas defining the between-trial parameters |
item_formulas |
A list of formulas defining the item parameters |
n_trials |
The number of trials to simulate |
n_items |
The number of items per trial |
max_reached |
The threshold for evidence accumulation |
max_t |
The maximum time to simulate |
dt |
The step size for each increment |
noise_mechanism |
The noise mechanism to use ("add" or "mult") |
noise_factory |
A function that takes condition_setting and returns a noise function with signature function(n, dt) |
backend |
The backend implementation to use ("ddm", "ddm-2b", or "lca-gi") |
trajectories |
Whether to return full output including trajectories. |
Value
A list containing the simulation results and condition parameters
Run a simulation with specified configuration
Description
This function runs a complete simulation based on the provided
eam_simulation_config object, which is generated by the
new_simulation_config function.
Usage
run_simulation(config, output_dir = NULL)
Arguments
config |
An eam_simulation_config object containing all simulation
parameters, you should use |
output_dir |
The directory to save out-of-core results (optional, will use temp directory if not provided) |
Details
This function uses an out-of-core approach to handle potentially large
simulation results. Instead of returning a data frame directly, it persists
the data to disk and returns an eam_simulation_output object that
contains metadata and file system paths.
To access the simulation data, use the following methods on the returned object:
- open_dataset(): Returns an Arrow Dataset containing the simulation results, e.g. sim_output$open_dataset()
- open_evaluated_conditions(): Returns an Arrow Dataset containing the evaluated condition parameters, e.g. sim_output$open_evaluated_conditions()
Both methods return Arrow Dataset objects rather than data frames, allowing
for efficient querying and filtering before loading data into memory. To
convert to a data frame, use dplyr::collect() or
as.data.frame().
Throughout this package, the eam_simulation_output object is used as
the standard parameter for downstream analysis functions, rather than
passing Arrow objects or data frames directly.
For multi-item backends, at most one item can reach the threshold at each
discrete time point, so the precision of threshold detection depends on the
dt parameter. This design choice was made for performance reasons, and its
effect is negligible in almost all experimental scenarios. If it is critical
for your application, increase the temporal resolution by reducing dt.
For implementation details, refer to the backend source code
(accumulate_evidence_* functions).
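The effect of time discretization on threshold detection can be illustrated with a small base R simulation. This is a generic sketch, not the package's backend: the two-accumulator race, the parameter values, and the tie-breaking rule (keep the accumulator with the most evidence) are all assumptions made for illustration.

```r
set.seed(123)
# Hypothetical settings: two accumulators racing to a common threshold
dt <- 0.01
max_t <- 10
threshold <- 1
drift <- c(0.9, 0.7)

evidence <- c(0, 0)
t_now <- 0
winner <- NA_integer_
while (t_now < max_t && is.na(winner)) {
  # Euler-Maruyama step: drift plus Gaussian noise scaled by sqrt(dt)
  evidence <- evidence + drift * dt + rnorm(2, mean = 0, sd = sqrt(dt))
  t_now <- t_now + dt
  crossed <- which(evidence >= threshold)
  if (length(crossed) > 0) {
    # If several cross in the same step, only one is recorded (assumed rule)
    winner <- crossed[which.max(evidence[crossed])]
  }
}
c(winner = winner, rt = t_now)
```

The recorded time is always a multiple of dt, so a smaller dt gives finer resolution at the cost of more simulation steps.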
Value
An S3 object of class eam_simulation_output containing the output information
Examples
# Define formulas for the simulation
prior_formulas <- list(
V ~ distributional::dist_uniform(0.1, 1.0),
ndt ~ 0.3,
noise_coef ~ 1
)
between_trial_formulas <- list()
item_formulas <- list(
A_upper ~ 1,
A_lower ~ -1,
V ~ V
)
# Define noise factory
noise_factory <- function(context) {
noise_coef <- context$noise_coef
function(n, dt) {
noise_coef * rnorm(n, mean = 0, sd = sqrt(dt))
}
}
# Create configuration
config <- new_simulation_config(
prior_formulas = prior_formulas,
between_trial_formulas = between_trial_formulas,
item_formulas = item_formulas,
n_conditions = 10,
n_trials_per_condition = 10,
n_items = 5,
max_reached = 5,
max_t = 10,
dt = 0.01,
noise_mechanism = "add",
noise_factory = noise_factory,
model = "ddm",
parallel = FALSE
)
# Run simulation
sim_output <- run_simulation(config)
# Access results
dataset <- sim_output$open_dataset()
dataset # an arrow dataset object
# if you want to load it into memory, you can use:
df <- as.data.frame(dataset)
head(df)
# Access evaluated condition parameters
cond_dataset <- sim_output$open_evaluated_conditions()
df_cond <- as.data.frame(cond_dataset)
head(df_cond)
Run a full simulation across multiple conditions in parallel
Description
This function runs a complete simulation across multiple conditions using parallel processing. It splits the conditions into chunks and processes each chunk on separate cores. Each condition has multiple trials and items. It uses the hierarchical structure: prior -> condition -> trial -> item. All parameters are taken from the configuration object.
Usage
run_simulation_parallel(config, output_dir)
Arguments
config |
An eam_simulation_config object |
output_dir |
The base output directory |
Value
No return value (results saved to disk)
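Splitting conditions into chunks can be sketched as follows. The sizes and the `process_chunk()` stand-in are hypothetical, and how the package itself partitions conditions is not confirmed here; `parallel::splitIndices()` is used only as one convenient way to form roughly equal chunks.

```r
# Hypothetical sizes: 10 conditions spread over 3 workers
n_conditions <- 10
n_chunks <- 3

# parallel::splitIndices() partitions 1..n into roughly equal contiguous chunks
chunks <- parallel::splitIndices(n_conditions, n_chunks)

# Stand-in for the per-chunk work; a real run would simulate and write to disk
process_chunk <- function(chunk_idx) {
  data.frame(chunk_idx = chunk_idx, condition_idx = chunks[[chunk_idx]])
}

# lapply() keeps the sketch portable; a parallel apply has the same shape
results <- do.call(rbind, lapply(seq_along(chunks), process_chunk))
table(results$chunk_idx)
```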
Run a full simulation across multiple conditions (serial version)
Description
This function runs a complete simulation across multiple conditions serially, with each condition having multiple trials and items. It uses the hierarchical structure: prior -> condition -> trial -> item. All parameters are taken from the configuration object.
Usage
run_simulation_serial(config, output_dir)
Arguments
config |
simulation config object |
output_dir |
The base output directory |
Value
No return value (results saved to disk)
Run a single trial of the DDM simulation
Description
This function runs a single trial of the DDM simulation using the provided item formulas and trial settings. It's a wrapper around the core C++ function
Usage
run_trial_ddm(
trial_setting,
item_formulas,
n_items,
max_reached,
max_t,
dt,
noise_mechanism,
noise_factory,
trajectories = FALSE
)
Arguments
trial_setting |
A list of named values representing the trial settings |
item_formulas |
A list of formulas defining the item parameters |
n_items |
The number of items to simulate |
max_reached |
The threshold for evidence accumulation |
max_t |
The maximum time to simulate |
dt |
The step size for each increment |
noise_mechanism |
The noise mechanism to use ("add" or "mult") |
noise_factory |
A function that takes trial_setting and returns a noise function with signature function(n, dt) |
trajectories |
Whether to return full output including trajectories. |
Value
A list containing the simulation results
Note
After evaluation, parameters A, V, and ndt are expected to be numeric vectors of length n_items. They are matched by position, so the first element of A, V, and ndt corresponds to the first item, and so on.
Run a single trial of the 2-boundary DDM simulation
Description
This function runs a single trial of the 2-boundary DDM simulation using the provided item formulas and trial settings. It's a wrapper around the core C++ function for 2-boundary DDM.
Usage
run_trial_ddm_2b(
trial_setting,
item_formulas,
n_items,
max_reached,
max_t,
dt,
noise_mechanism,
noise_factory,
trajectories = FALSE
)
Arguments
trial_setting |
A list of named values representing the trial settings |
item_formulas |
A list of formulas defining the item parameters |
n_items |
The number of items to simulate |
max_reached |
The threshold for evidence accumulation |
max_t |
The maximum time to simulate |
dt |
The step size for each increment |
noise_mechanism |
The noise mechanism to use ("add", "mult_evidence", or "mult_t") |
noise_factory |
A function that takes trial_setting and returns a noise function with signature function(n, dt) |
trajectories |
Whether to return full output including trajectories. |
Value
A list containing the simulation results
Note
After evaluation, parameters A_upper, A_lower, V, and ndt are expected to be numeric vectors of length n_items. They are matched by position, so the first element of A_upper, A_lower, V, and ndt corresponds to the first item, and so on.
Run a single trial of the LCA-GI simulation
Description
This function runs a single trial of the LCA-GI (Leaky Competing Accumulator with Global Inhibition) simulation using the provided item formulas and trial settings. It's a wrapper around the core C++ function for LCA-GI.
Usage
run_trial_lca_gi(
trial_setting,
item_formulas,
n_items,
max_reached,
max_t,
dt,
noise_factory,
trajectories = FALSE
)
Arguments
trial_setting |
A list of named values representing the trial settings |
item_formulas |
A list of formulas defining the item parameters |
n_items |
The number of items to simulate |
max_reached |
The threshold for evidence accumulation |
max_t |
The maximum time to simulate |
dt |
The step size for each increment |
noise_factory |
A function that takes trial_setting and returns a noise function with signature function(n, dt) |
trajectories |
Whether to return full output including trajectories. |
Value
A list containing the simulation results
Note
After evaluation, parameters A, V, ndt, beta, and k are expected to be numeric vectors of length n_items. They are matched by position, so the first element of A, V, ndt, beta, and k corresponds to the first item, and so on.
Summarise data by groups with optional pivoting
Description
This function provides a flexible way to group data, compute summary statistics, and reshape results. It works similarly to 'dplyr::summarise()' but with added capabilities for pivoting results wider.
Usage
summarise_by(
.data = NULL,
...,
.by = c("condition_idx"),
.wider_by = c("condition_idx")
)
Arguments
.data |
A data frame to summarise, or NULL to create a reusable summary function |
... |
Summary expressions using dplyr-style syntax. Named arguments become column names in the output (e.g., 'mean_rt = mean(rt)'). |
.by |
Character vector of grouping column names. Default is "condition_idx". |
.wider_by |
Character vector of columns to keep as identifiers when pivoting. Default is "condition_idx". Must be a subset of '.by'. When '.wider_by' differs from '.by', the extra columns in '.by' will be spread across as column suffixes. |
Details
You can use 'summarise_by()' in two ways:
1. Direct use: Pass your data directly and get results immediately.
2. Build-then-apply: Create reusable summary functions, combine them with '+', then apply them to your data later.
The build-then-apply approach is useful when you want to compute different types of summaries (e.g., RT statistics and accuracy statistics) and automatically join them together.
Value
- If '.data' is provided: A data frame with summarised results
- If '.data' is NULL: A function that can be applied to data later
Usage with ABC workflows
If you plan to use build_abc_input for ABC analysis, you must use
summarise_by() to generate summary statistics (or manually handle the arrow
output format). This function typically works together with map_by_condition
to process simulation results. See map_by_condition for workflow examples.
Examples
# Example 1: Direct use - pass data and get results immediately
trial_data <- data.frame(
condition_idx = rep(1:2, each = 4),
item_idx = rep(1:2, 4),
rt = c(0.5, 0.6, 0.7, 0.8, 0.55, 0.65, 0.75, 0.85),
accuracy = c(1, 1, 0, 1, 1, 0, 1, 1)
)
# Compute mean RT and accuracy by condition and item
result <- summarise_by(
trial_data,
mean_rt = mean(rt),
mean_acc = mean(accuracy),
.by = c("condition_idx", "item_idx"),
.wider_by = "condition_idx"
)
# Result has columns: condition_idx, mean_rt_item_idx_1, mean_rt_item_idx_2, etc.
result
# Example 2: Build-then-apply - create reusable summary functions
# Build separate summary functions for different statistics
rt_summary_pipe <- summarise_by(
mean_rt = mean(rt),
sd_rt = stats::sd(rt),
.by = c("condition_idx", "item_idx"),
.wider_by = "condition_idx"
)
acc_summary_pipe <- summarise_by(
mean_acc = mean(accuracy),
n_trials = length(accuracy),
.by = c("condition_idx", "item_idx"),
.wider_by = "condition_idx"
)
# Combine with + and apply to data
combined_summary_pipe <- rt_summary_pipe + acc_summary_pipe
result <- combined_summary_pipe(trial_data)
# Result has all summaries joined by condition_idx
result
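For readers curious how functions can be combined with '+', the following base R sketch shows one possible mechanism using an S3 class. It is not the package's implementation: `make_pipe()`, the `summary_pipe` class, and the merge on `condition_idx` are hypothetical names and choices made for illustration.

```r
# Hypothetical helper: wrap a summary function so it can be combined with `+`
make_pipe <- function(fun) structure(fun, class = c("summary_pipe", "function"))

# `+` returns a new pipe that applies both summaries and joins them by key
"+.summary_pipe" <- function(e1, e2) {
  make_pipe(function(data) merge(e1(data), e2(data), by = "condition_idx"))
}

rt_pipe <- make_pipe(function(data) {
  aggregate(rt ~ condition_idx, data = data, FUN = mean)
})
acc_pipe <- make_pipe(function(data) {
  aggregate(accuracy ~ condition_idx, data = data, FUN = mean)
})

demo_data <- data.frame(
  condition_idx = rep(1:2, each = 4),
  rt = c(0.5, 0.6, 0.7, 0.8, 0.55, 0.65, 0.75, 0.85),
  accuracy = c(1, 1, 0, 1, 1, 0, 1, 1)
)

# Combine first, apply later
combined <- rt_pipe + acc_pipe
demo_result <- combined(demo_data)
demo_result
```

Because the combined object is itself a function, it can be reused on any data frame that has the same columns.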
Internal function to perform the core summarise_by logic
Description
Internal function to perform the core summarise_by logic
Usage
summarise_by_impl(.data, dots, .by, .wider_by)
Arguments
.data |
A data frame to summarise |
dots |
Quosures containing the summary expressions |
.by |
Character vector of column names to group by |
.wider_by |
Character vector of column names to keep as identifying columns |
Value
A data frame with class "eam_summarise_by_tbl"
Summarise posterior parameter distributions
Description
Compute summary statistics (mean, median, confidence intervals) for posterior parameters from ABC results.
Usage
summarise_posterior_parameters(data, ...)
## S3 method for class 'abc'
summarise_posterior_parameters(data, ..., ci_level = 0.95)
Arguments
data |
An |
... |
Additional arguments for custom summary functions. Functions passed as named arguments will be applied to each parameter's posterior samples. |
ci_level |
Numeric; confidence interval level (default: 0.95). |
Value
A data frame with summary statistics for each parameter.
See Also
summarise_posterior_parameters.abc
Examples
# Load ABC output from saved file
abc_file <- system.file(
"extdata", "rdm_minimal", "abc", "abc_rejection_model.rds",
package = "eam"
)
abc_rejection_model <- readRDS(abc_file)
# Summarise posterior distributions
summarise_posterior_parameters(abc_rejection_model)
# Custom confidence interval level
summarise_posterior_parameters(abc_rejection_model, ci_level = 0.90)
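The kind of summary described above can be reproduced by hand with base R. The sketch below works under stated assumptions: `posterior` is a hypothetical data frame of simulated draws, and equal-tailed quantile intervals are used, which may differ from how the package computes its intervals.

```r
set.seed(7)
# Hypothetical posterior draws for two parameters
posterior <- data.frame(
  V = rnorm(500, mean = 0.6, sd = 0.1),
  ndt = rnorm(500, mean = 0.3, sd = 0.02)
)

ci_level <- 0.90
alpha <- 1 - ci_level

# One row per parameter: mean, median, and equal-tailed interval bounds
summary_tbl <- do.call(rbind, lapply(names(posterior), function(p) {
  x <- posterior[[p]]
  data.frame(
    parameter = p,
    mean = mean(x),
    median = median(x),
    ci_lower = unname(quantile(x, alpha / 2)),
    ci_upper = unname(quantile(x, 1 - alpha / 2))
  )
}))
summary_tbl
```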
Summarise resample medians
Description
Calculate summary statistics for parameter medians across resample iterations. Returns mean, median, and confidence intervals of the median distributions.
Usage
summarise_resample_medians(data, ..., ci_level = 0.95)
Arguments
data |
List of abc results from abc_resample |
... |
Additional custom summary functions (named functions) |
ci_level |
Confidence level for intervals (default 0.95) |
Value
Data frame with summary statistics for each parameter
Examples
# Load ABC input data from example simulation
abc_input <- readRDS(
system.file("extdata", "rdm_minimal", "abc", "abc_input.rds", package = "eam")
)
# Perform ABC resampling
results <- abc_resample(
target = abc_input$target,
param = abc_input$param,
sumstat = abc_input$sumstat,
n_iterations = 100,
n_samples = 100,
tol = 0.5,
method = "rejection"
)
# summarise the resample medians
summary_stats <- summarise_resample_medians(results, ci_level = 0.95)
print(summary_stats)
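The logic of summarising medians across resample iterations can be sketched without the package. Everything below is illustrative: `iterations` is a hypothetical list of accepted samples, and the extra `iqr` entry stands in for a user-supplied custom summary.

```r
set.seed(99)
# Hypothetical accepted samples from 100 resample iterations
iterations <- replicate(100, rnorm(50, mean = 0.5, sd = 0.1), simplify = FALSE)

# One median per iteration
medians <- vapply(iterations, median, numeric(1))

# Summary of the median distribution, plus a custom statistic
ci_level <- 0.95
tail_prob <- (1 - ci_level) / 2
median_summary <- c(
  mean = mean(medians),
  median = median(medians),
  ci_lower = unname(quantile(medians, tail_prob)),
  ci_upper = unname(quantile(medians, 1 - tail_prob)),
  iqr = IQR(medians)
)
round(median_summary, 3)

# Density of the medians, analogous to a median-distribution plot
plot(density(medians), main = "Distribution of resample medians")
```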