---
title: "Modeling with constraints"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Modeling with constraints}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette demonstrates how to apply parameter constraints when modeling 
biological processes using {flexFitR}. Constraints can help ensure that 
parameter estimates remain within realistic or biologically meaningful ranges, 
improving both the interpretability and reliability of model outcomes. 

## Introduction to Modeling with Constraints

In many biological models, certain relationships between parameters are 
expected. For example:

* Some parameters should not exceed certain values (e.g., maximum growth rates).
* Some parameters should maintain specific relationships with each other 
(e.g., one stage occurring before another in time).

This vignette demonstrates how to apply these types of constraints in {flexFitR} 
to guide the optimization process.

### Example Case

For this example, we use the Green Leaf Index (GLI) derived from UAV imagery to
model plant emergence, canopy closure, and senescence. 
The parameters we are interested in include:

* t1: Emergence time
* t2: Canopy closure time
* t3: Senescence onset

Our expectation is that $0 < t1 < t2 < t3$. We will apply constraints to ensure 
this relationship hold.

## Loading libraries

```{r, warning=FALSE, message=FALSE }
library(flexFitR)
library(dplyr)
library(kableExtra)
library(ggpubr)
library(purrr)
```

## 1. Exploring data

We begin with the `explorer` function, which provides basic statistical 
summaries and visualizations to help understand the temporal evolution of
each plot.

```{r}
data(dt_potato)
explorer <- explorer(dt_potato, x = DAP, y = c(GLI), id = Plot)
```

```{r, fig.width= 8, fig.height=3, fig.alt="plot corr"}
p1 <- plot(explorer, type = "evolution", return_gg = TRUE, add_avg = TRUE)
p2 <- plot(explorer, type = "x_by_var", return_gg = TRUE)
ggarrange(p1, p2, nrow = 1)
```

```{r}
kable(mutate_if(explorer$summ_vars, is.numeric, round, 2))
```

## 2. Regression function

After exploring the data, we define the regression function. Here we use a 
linear-plateau-linear function with five parameters: 
t1, t2, t3, k, and $\beta$. The function can be expressed mathematically as
follows:

`fn_lin_pl_lin()`

\begin{equation}
f(t; t_1, t_2, t_3, k, \beta) =
\begin{cases}
0 & \text{if } t < t_1 \\
\dfrac{k}{t_2 - t_1} \cdot (t - t_1) & \text{if } t_1 \leq t \leq t_2 \\
k & \text{if } t_2 \leq t \leq t_3 \\
k + \beta \cdot (t - t_3) & \text{if } t > t_3
\end{cases}
\end{equation}

```{r, fig.width= 8, fig.height=4, fig.alt="plot fn"}
plot_fn(
  fn = "fn_lin_pl_lin",
  params = c(t1 = 38.7, t2 = 62, t3 = 90, k = 0.32, beta = -0.01),
  interval = c(0, 108),
  color = "black",
  base_size = 15
)
```

To impose constraints, we can reformulate the function. For instance, 
if we want to ensure that $t3 \geq t2$, we introduce dt as the difference between t3 
and t2:

\begin{equation}
f(t; t_1, t_2, dt, k, \beta) =
\begin{cases}
0 & \text{if } t < t_1 \\
\dfrac{k}{t_2 - t_1} \cdot (t - t_1) & \text{if } t_1 \leq t \leq t_2 \\
k & \text{if } t_2 \leq t \leq (t_2 + dt) \\
k + \beta \cdot (t - (t_2 + dt)) & \text{if } t > (t_2 + dt)
\end{cases}
\end{equation}

To enforce $dt > 0$ and $\beta < 0$ (i.e., a non-positive slope at the end of 
the curve), we specify bounds in the modeler function as follows:

```{r}
# Define constraints and bounds for the model
lower_bounds <- c(t1 = 0, t2 = 0, dt = 0, k = 0, beta = -Inf)
upper_bounds <- c(t1 = Inf, t2 = Inf, dt = Inf, k = Inf, beta = 0)
# Initial values
initial_vals <- c(t1 = 38, t2 = 62, dt = 28, k = 0.32, beta = -0.01)
```

## 3. Fitting Models with Constraints

We fit the model with these constraints by passing lower and upper arguments to 
`modeler`. In this vignette, we fit the model for plots 195 and 40 as a `subset`
of the total 196 plots.

```{r, warning=FALSE, message=FALSE}
mod_1 <- dt_potato |>
  modeler(
    x = DAP,
    y = GLI,
    grp = Plot,
    fn = "fn_lin_pl_lin2",
    parameters = initial_vals,
    lower = lower_bounds,
    upper = upper_bounds,
    method = c("nlminb", "L-BFGS-B"),
    subset = c(195, 40)
  )
```

Here:

* x specifies the days after planting (DAP),
* y is the GLI variable to be modeled
* grp enables group analysis across multiple plots
* parameters are the initial parameter values
* method specifies the optimization methods to evaluate

After fitting, we can inspect the model summary and visualize the fit using
the `plot` function:

```{r}
print(mod_1)
```

```{r, fig.width= 8, fig.height=4, fig.alt="plot fit 1"}
plot(mod_1, id = c(195, 40))
kable(mod_1$param)
```

## 3.1. Extracting model coefficients and uncertainty measures

Once the model is fitted, we can extract key statistical information, such as 
coefficients, standard errors, confidence intervals, and the variance-covariance
matrix for each plot. These metrics help evaluate parameter reliability and 
assess uncertainty.

The functions `coef`, `confint`, and `vcov` are used as follows:

* **coef**: Extracts the estimated coefficients for each group.
* **confint**: Provides the confidence intervals for the parameter estimates.
* **vcov**: Returns the variance-covariance matrix, which can be used 
to understand the relationships between the estimates and their variability.

```{r}
coef(mod_1, id = 40)
```
```{r}
confint(mod_1, id = 40)
```

```{r}
vcov(mod_1, id = 40)
```

## 4. Plotting options

Using `type = 2` in the `plot` function generates a coefficients plot.
This allows us to view the estimated coefficients and their associated
confidence intervals for each group.

```{r, fig.width= 8, fig.height=4, fig.alt="plot coef"}
plot(mod_1, type = 2, id = c(195, 40), label_size = 8)
```

Another option (`type = 4`) shows the fitted curve (black line), confidence
interval (blue-dashed line), and prediction interval (red-dashed line).
Additionally, setting type = 5 displays the first derivative, indicating the
rate of change over time.

```{r,  fig.width= 8, fig.height=4, fig.alt="plot derivatives"}
a <- plot(mod_1, type = 4, color = "black", title = "Fitted Curve + CIs & PIs")
b <- plot(mod_1, type = 5, color = "black")
ggarrange(a, b)
```

## 5. Conclusion

This vignette showed how to apply constraints in {flexFitR} models to better
capture biological realities and improve parameter estimation. Constraints can 
be an essential tool for ensuring that models produce interpretable and
meaningful results.