---
title: "sample_from_density()"
output:
  rmarkdown::html_vignette:
    fig_width: 5
    fig_height: 4
    self_contained: false
vignette: |
  %\VignetteIndexEntry{sample_from_density()}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r echo = FALSE, warning=FALSE}
library(YEAB)
```

## Introduction

Behavioral data often exhibit complex patterns that do not follow well-known parametric distributions, so statistical methods that rely on such distributions can fail to capture the behavior observed in real experiments. Kernel Density Estimation (KDE) offers a nonparametric alternative: it approximates the density of the underlying distribution directly from a data sample.

Once a density has been estimated, it can be used to generate synthetic samples that closely mimic the distribution of the observed data. Such samples allow researchers to explore hypotheses, validate statistical models, and assess the efficacy of experimental interventions in a controlled setting.

Here we introduce the `sample_from_density()` function, which draws `n` data points from a density estimated with KDE from a given data set. The function takes two parameters:

- `x`: a numeric vector of data points from a distribution.
- `n`: the number of samples to return.

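
To make the idea of sampling from a KDE concrete, here is a minimal base-R sketch. It is not YEAB's implementation, which may differ. It assumes a Gaussian kernel and the rule-of-thumb bandwidth from `stats::bw.nrd0()`, and the helper name `sketch_sample_kde()` is hypothetical. Drawing from a Gaussian KDE amounts to picking an observed point at random and adding kernel noise to it.

```{r}
# Hypothetical helper, not part of YEAB: draw n points from a Gaussian KDE of x.
sketch_sample_kde <- function(x, n) {
  stopifnot(is.numeric(x), length(x) >= 2, n >= 1)
  # Assumption: rule-of-thumb bandwidth, the default used by stats::density()
  bw <- stats::bw.nrd0(x)
  # Pick kernel centers by resampling the observed data with replacement
  centers <- x[sample.int(length(x), size = n, replace = TRUE)]
  # Add Gaussian noise whose standard deviation equals the bandwidth
  centers + stats::rnorm(n, mean = 0, sd = bw)
}

toy_data <- stats::rnorm(100)
toy_draws <- sketch_sample_kde(toy_data, 100)
summary(toy_draws)
```

Because noise is added to each resampled point, draws from a KDE are slightly more spread out than the original data. The size of that extra spread depends on the bandwidth.
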
## Example

Let's generate a random sample from a standard normal distribution:

```{r}
set.seed(142)
normal_sample <- rnorm(100)
```

Now let's use `sample_from_density()` to draw 100 data points from the density estimated from that sample:

```{r}
sampled_dist <- sample_from_density(normal_sample, 100)
```

Finally, let's compare the estimated densities of the original sample and the new sample:

```{r, echo = FALSE}
# Density of the original sample (blue)
plot(ks::kde(normal_sample),
  xlab = "",
  main = "original vs sample_from_density()",
  col = "blue",
  ylim = c(0, 0.5)
)
# Density of the sample drawn with sample_from_density() (red)
kde_sample <- ks::kde(sampled_dist)
plot(kde_sample, col = "red", add = TRUE)
legend("topleft",
  c("sample_from_density", "original"),
  lty = c(1, 1),
  col = c("red", "blue")
)
```
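
A visual comparison can be complemented with a few numbers. The sketch below defines a hypothetical helper, `compare_samples()`, which is not part of YEAB. It reports the mean and standard deviation of each sample together with the two-sample Kolmogorov-Smirnov statistic from `stats::ks.test()`. Treat the p-value as a rough indicator only, since the second sample is derived from the first and the two are not independent.

```{r}
# Hypothetical helper, not part of YEAB: numeric summary of how close two samples are.
compare_samples <- function(original, resampled) {
  # Two-sample Kolmogorov-Smirnov test; warnings (e.g. about ties) are silenced
  ks <- suppressWarnings(stats::ks.test(original, resampled))
  c(
    mean_original  = mean(original),
    mean_resampled = mean(resampled),
    sd_original    = stats::sd(original),
    sd_resampled   = stats::sd(resampled),
    ks_statistic   = unname(ks$statistic),
    ks_p_value     = ks$p.value
  )
}
```

With the objects created above, it could be called as `compare_samples(normal_sample, sampled_dist)`. Similar means and standard deviations, together with a small KS statistic, would be consistent with the overlap seen in the plot.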