--- title: "Using the eFCM Package: An Example with European Winter Precipitation" output: rmarkdown::html_vignette: code_download: false vignette: > %\VignetteIndexEntry{prEurope} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Introduction This vignette illustrates a complete workflow for fitting the exponential factor copula model using the `eFCM` package, including data preprocessing, model fitting, diagnostic checking, and simulation. We illustrate the workflow using precipitation data from a counterfactual climate scenario, chosen here purely as an example dataset. ## Exponential factor copula model The exponential factor copula model (eFCM; Castro-Camilo and Huser, 2020) is stochastically defined as $$ W(s) = Z(s) + V, $$ where: - \( Z(s) \), \( s \in \boldsymbol{S} \subset \mathbb{R}^2 \), is a standard Gaussian process with stationary correlation function \( \rho(h) \), - \( h = \| s_1 - s_2 \| \) is the distance between two locations \( s_1, s_2 \in \boldsymbol{S} \), - \( V \sim \text{Exp}(\lambda) \) is an exponentially distributed random variable with rate parameter $\lambda$, independent of both \( Z(s) \) and the spatial location \(s\). The process W(s) belongs to the class of asymptotically dependent models and may be viewed as a Gaussian location mixture. Dependence is introduced through the common factor V, which induces asymptotic dependence while allowing flexible representation of sub-asymptotic extremal dependence. Because this factor does not vary spatially, the model is particularly well-suited to spatially homogeneous regions, where it is reasonable to assume similar marginal distributions and tail behavior across sites. If \( Z_j = Z(s_j) \), for \( j = 1,\ldots,d \), the multivariate latent Gaussian vector \( \boldsymbol{Z} = (Z_1, \dots, Z_d)^\top \) follows a multivariate normal distribution, \(\boldsymbol{Z} \sim \Phi(\cdot; \Sigma_Z),\), where the covariance matrix \( \Sigma_Z \) is determined by the correlation function \( \rho(h) \). In this application, we adopt the exponential correlation function,a special case of the Matérn class, $$ \rho(h) = \exp\left(-\frac{h}{\delta}\right), $$ where \( \delta > 0 \) is a range parameter controlling the spatial decay. With this, the joint distribution of the process \(W\) is $$F_d^W(\mathbf{w}) = \Phi_{D}(\mathbf{w};\mathbf{\Sigma}_{\mathbf{Z}}) - \sum_{j = 1}^D\exp\left(\frac{\lambda^2}{2} - \lambda w_j\right)\Phi_{D}\left(\mathbf{q}_{j,0}- \mathbf{\mu}_{j,0};\mathbf\Omega_{j,0}\right),$$ where $$ \mathbf{q}_{j,0} = \begin{pmatrix}\mathbf{w}_{-j} - \mathbf{\Sigma}_{\mathbf{Z};-j,j}w_j\\ 0\end{pmatrix}, \quad \mathbf{\mu}_{j,0} = \begin{pmatrix} (w_j - \lambda)(\mathbf{1}_{d-1} - \mathbf{\Sigma}_{\mathbf{Z};-j,j})\\ \lambda -w_j \end{pmatrix},$$ $$\mathbf\Omega_{j,0} =\begin{pmatrix} \mathbf{1}_{d-1}\mathbf{1}_{d-1}^T - 2\mathbf{\Sigma}_{\mathbf{Z};-j,j}+\mathbf{\Sigma}_{\mathbf{Z};-j,-j}&\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}\\\mathbf{\Sigma}_{\mathbf{Z}; -j,j}-\mathbf{1}_{d-1}&1\end{pmatrix}, $$ In particular, by setting $d=1$, the marginal distribution may be written as $$F_1^{\mathbf{W}}(w;\lambda) = \Phi(w) - \exp(\lambda^2/2 - \lambda w)\Phi(w - \lambda).$$ ## Data Preparation We start by loading the package and the pre-processed weekly counterfactual precipitation data. Specifically, `counterfactual` contains weekly maxima of precipitation under natural-only forcing, and `LonLat` provides spatial coordinates for each station. ```{r setup, message=FALSE, warning=FALSE} library(eFCM) # Load weekly precipitation maxima (pre-aggregated) data("counterfactual") # matrix/data frame: [weeks × stations] # Load station coordinates data("LonLat") coord <- LonLat ``` We can visualize how spatially averaged weekly maxima evolve over time: ```{r, fig.width = 6, fig.height = 4, fig.align='center'} plot(1:nrow(counterfactual), apply(counterfactual, 1, mean), type = "l", xlab = "Week", ylab = "Mean Precipitation (mm)", main = "Weekly Maxima of Counterfactual Precipitation") abline(h = quantile(apply(counterfactual, 1, mean), 0.9), col = "red", lty = 2) ``` ## Create Spatial Data Objects The function `fdata()` converts the spatiotemporal precipitation data into the format required for model fitting: ```{r, eval=FALSE} cf_data = fdata(counterfactual, coord, cellsize = c(1, 1)) ``` To reduce vignette build time, we load precomputed results: ```{r, echo=T} data(cf_data) data(fit) ``` ## Model Fitting We fit the exponential factor copula model to the counterfactual data using `fcm()`: ```{r, eval=FALSE} fit = fcm(s = 1, cf_data, boot = T, R = 1000) ``` Here, `s = 1` is the index of the grid point and the and the exceedance threshold is set via `thres`, a quantile level in (0,1) (default 0.9), and `boot = TRUE`, `R = 1000` requests 1,000 bootstrap replications for uncertainty quantification. ## Model Summary and Diagnostics ```{r} summary(fit) ``` We can extract point estimates of the parameters. Recall that the parameter estimates ($\lambda$ and $\delta$) describe the strength of the common factor and the spatial range of dependence, respectively: ```{r} coef(fit, method = "hessian") coef(fit, method = "boot") ``` We can also compute the usual model selection criteria indices: ```{r} logLik(fit) AIC(fit) BIC(fit) AICc(fit) ``` ## Goodness-of-Fit We can draw Q-Q plots to compare empirical and model-based upper tail quantiles for a given station. By default, `qqplot()` also overlays a generalized Pareto distribution (GPD) fit; this can be suppressed by setting `gpd = FALSE`. ```{r, fig.width = 4, fig.height = 4.5, fig.align='center'} qqplot(fit, which = 1) ``` We can also assess extremal dependence using the conditional exceedance probability $\chi_h(u)$, which measures the probability of simultaneous exceedances at high but finite thresholds. For two locations $s_1$ and $s_2$ separated by distance $h$, with respective vector components $W(s_1)$ and $W(s_2)$, $\chi_h(u)$ is defined as $\chi_h(u)= \lim_{u\to1}\Pr(W_{s_1}(W(s_1))>u\mid F_{s_2}(W(s_2))>u)$. The function `chiplot()` in `eFCM` plots model-based estimated of $\chi_h(u)$ alongside their empirical counterpart for a range of values of $u$. The package also provides two methods for pointwise confidence intervals: a normal approximation to the MLE or bootstrap resampling. Both approaches are illustrated below. ```{r, eval=T, fig.width = 4, fig.height = 4.5, fig.align='center'} chiplot(fit, method = "hessian", ylim = c(0, 1)) ``` ```{r, eval=T, fig.width = 4, fig.height = 4.5, fig.align='center'} chiplot(fit, method = "boot", ylim = c(0, 1)) ``` ## Simulation from the Fitted Model To generate synthetic precipitation fields consistent with the estimated extremal dependence structure, we can simulate from the fitted model as follows: ```{r, eval=FALSE} lbda <- fit$par[1] delta <- fit$par[2] dist <- rdist.earth(fit$coord) sim <- rmfcm(lbda, delta, dist, n = 1e4) ``` ## Summary This vignette has demonstrated the complete pipeline for fitting the exponential factor copula model using the `eFCM` package. It covered preprocessing of data (in this case, climate model outputs), model estimation, diagnostic checks, and simulation. For a more detailed discussion of the method and its application to extreme event attribution, see Li and Castro-Camilo (2025+). ## References Castro-Camilo, D., & Huser, R. (2020). Local likelihood estimation of complex tail dependence structures, with application to U.S. precipitation extremes. *Journal of the American Statistical Association*, *115*(531), 1037–1054. Li, M., & Castro-Camilo, D. (2025+). On the importance of tail assumptions in climate extreme event attribution. arXiv preprint.