--- title: "How to start" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{How to start} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Getting started The basic idea of this vignette is to illustrate to users how to use the flexFitR package. We'll start with a very basic example: a simple linear regression. Although this example is not the primary focus of the package, it will serve to demonstrate its use. ## 1. Simple linear regression In this example, we'll work with a small dataset consisting of 6 observations, where X is the independent variable and Y is the dependent variable. ```{r, warning=FALSE, message=FALSE} library(flexFitR) library(dplyr) library(ggpubr) ``` ```{r , warning=FALSE, message=FALSE, fig.alt = "Plot x,y"} dt <- data.frame(X = 1:6, Y = c(12, 16, 44, 50, 95, 100)) plot(explorer(dt, X, Y), type = "xy") ``` First, we define an objective function. In this case, the function `fn_lm` will represent the linear regression, where b is the intercept and m is the slope of the regression. ```{r, warning=FALSE, message=FALSE} fn_lm <- function(x, b, m) { y <- b + m * x return(y) } ``` The `plot_fn` function, which is integrated into the package, allows us to plot any function with the parameters provided. This is useful for visualizing the shape of the function before fitting the model to the data. ```{r, fig.alt = "Plot function"} plot_fn(fn = "fn_lm", params = c(b = 10, m = 5)) ``` To fit the model, we use the `modeler` function. In this function, we pass x as the independent variable, y as the dependent variable, and then a vector of parameters where we assign initial values to our coefficient b and coefficient m. ```{r, warning=FALSE, message=FALSE} mod <- dt |> modeler( x = X, y = Y, fn = "fn_lm", parameters = c(b = -5, m = 10) ) mod ``` Once the model is fitted, we can examine the output, extract the estimated parameters, make some plots, and predict new x values. ```{r, fig.alt = "Plot evolution", fig.width= 8, fig.height=4} a <- plot(mod, color = "blue", title = "Raw data") b <- plot(mod, type = 4, n_points = 200, color = "black") ggarrange(a, b) ``` In order to get the coefficients with their variance-covariance matrix we make use of the `coef` and `vcov` function, which only takes the model object as an argument. ```{r} coef(mod) ``` ```{r} vcov(mod) ``` Finally, we can make predictions using the predict function, which takes the fitted model as an object and X as the value for which we want to make the prediction. ```{r} predict(mod, x = 4.5) ``` We can compare this with the lm function in R, which will give results similar to those obtained with our package. ### Comparison with `lm` ```{r, warning=FALSE, message=FALSE} mo <- lm(Y ~ X, data = dt) ``` ```{r} summary(mo)$coefficients ``` ```{r} vcov(mo) predict(mo, newdata = data.frame(X = 4.5), se.fit = TRUE) ``` While the previous example was fairly simple, we can consider a more complex scenario where we need to fit not just one function, but hundreds of functions for several groups. This can be achieved using the `grp` argument in the `modeler` function. Additionally, we can parallelize these processes by setting the `parallel` argument to `TRUE` and defining the number of cores to use. It’s important to note that depending on the functions defined by the user, some parameters may need to be constrained, such as being required to be greater than or less than zero. In other cases, certain parameters might need to be fixed at known values. In these more complex situations, where we have many curves to fit and are working with complex functions—whether non linear regressions with specific parameter constraints or cases where some parameters are fixed for each group—modeler offers extensive flexibility. ## 2. Piece-wise regression The following example, although still simple, represents a slightly more complex function with a greater number of parameters. In this case, we have a piece-wise regression, parameterized by `t1`, `t2`, and `k`, and defined by the following expression: ```{r, warning=FALSE, message=FALSE} fun <- function(t, t1 = 45, t2 = 80, k = 0.9) { if (t < t1) { y <- 0 } else if (t >= t1 && t <= t2) { y <- k / (t2 - t1) * (t - t1) } else { y <- k } return(y) } ``` Before fitting the model, let's take a look at the example dataset. ```{r, fig.alt = "Plot x,y"} dt <- data.frame( time = c(0, 29, 36, 42, 56, 76, 92, 100, 108), variable = c(0, 0, 0.67, 15.11, 77.38, 99.81, 99.81, 99.81, 99.81) ) plot(explorer(dt, time, variable), type = "xy") ``` We can make a plot of the piecewise function and then fit the model using the `modeler` function. ```{r fig.alt = "Plot function"} plot_fn(fn = "fun", params = c(t1 = 25, t2 = 70, k = 90)) ``` ```{r, warning=FALSE, message=FALSE} mod_1 <- dt |> modeler( x = time, y = variable, fn = "fun", parameters = c(t1 = 45, t2 = 80, k = 90) ) mod_1 ``` After fitting the model, we can examine the results, plot the fitted curve, extract the coefficients and their associated p-values, obtain the variance-covariance matrix, and make predictions for unknown values of x. ```{r fig.alt = "Plot evolution"} plot(mod_1) ``` ```{r} # Coefficients coef(mod_1) ``` ```{r} # Variance-Covariance Matrix vcov(mod_1) ``` ```{r} # Making predictions predict(mod_1, x = 45) ``` Finally, we illustrate how to provide different initial values to the function when dealing with multiple groups, and we also show how to fix some parameters of the objective function. ### Providing Initial values In this example, we don't have a grouping variable. However, by default, the function assigns a unique identifier (`uid`) to the dataset. Because of this, we need to specify `uid = 1` for the initial values and fixed parameters. If there is only one group, you only need to modify the parameters argument accordingly. This approach is primarily for illustrative purposes. ```{r, warning=FALSE, message=FALSE} init <- data.frame(uid = 1, t1 = 20, t2 = 30, k = 0.8) mod_2 <- dt |> modeler( x = time, y = variable, fn = "fun", parameters = init ) mod_2 coef(mod_2) ``` ### Fixing parameters ```{r, warning=FALSE, message=FALSE, fig.alt = "Plot evolution"} fix <- data.frame(uid = 1, k = 98) mod_3 <- dt |> modeler( x = time, y = variable, fn = "fun", parameters = c(t1 = 45, t2 = 80, k = 90), fixed_params = fix ) mod_3 coef(mod_3) plot(mod_3) ``` ```{r} rbind(metrics(mod_1), metrics(mod_2), metrics(mod_3)) ``` This vignette provided a basic introduction to using the flexFitR package, starting with simple examples such as linear regression and piecewise regression. The goal was to demonstrate the fundamental features and flexibility of the package. However, more complex situations can arise when working with high-throughput phenotypic (HTP) data, which involve multiple groups, parameter constraints, and advanced modeling scenarios. These more complex situations are illustrated in the other vignettes, which use real HTP data to showcase the full capabilities of the flexFitR package.