This document gives an overview of the functionality provided by the
R package `APCtools`.

Age-Period-Cohort (APC) analysis is used to disentangle observed trends (e.g. of social, economic, medical or epidemiological data) to enable conclusions about the developments over three temporal dimensions:

- Age, representing the developments associated with chronological age over someone's life cycle.
- Period, representing the developments over calendar time which affect all age groups simultaneously.
- Cohort, representing the developments observed over different birth cohorts and generations.

The critical challenge in APC analysis is that these main components are linearly dependent:

\[ \text{cohort} = \text{period} - \text{age} \]
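
The practical consequence of this identity can be checked numerically. The following minimal sketch uses simulated data (the variable names and the arbitrary outcome `y` are illustrative assumptions, not part of `APCtools`) to show that a linear model cannot estimate separate linear effects for all three dimensions:

```r
# Simulated repeated cross-sectional data (illustrative only)
set.seed(1)
n <- 200
dat <- data.frame(age    = sample(20:80, n, replace = TRUE),
                  period = sample(1970:2019, n, replace = TRUE))
dat$cohort <- dat$period - dat$age   # the APC identity
dat$y      <- rnorm(n)               # arbitrary outcome

# The design matrix has four columns but only rank three
X <- model.matrix(~ age + period + cohort, data = dat)
qr(X)$rank

# lm() therefore cannot identify the third linear effect and reports NA for it
fit <- lm(y ~ age + period + cohort, data = dat)
coef(fit)
```

Whichever of the three variables enters the formula last is the one whose coefficient is dropped, which illustrates that the data alone cannot decide how a linear trend is split between age, period and cohort.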

Accordingly, flexible methods and visualization techniques are needed
to properly disentangle observed temporal association structures. The
`APCtools` package comprises different methods that tackle
this problem and aims to cover all steps of an APC analysis. This
includes state-of-the-art descriptive visualizations as well as
visualization and summary functions based on the estimation of a
generalized additive regression model (GAM). The main functionalities of
the package are highlighted in the following.

For details on the statistical methodology see Weigert et
al. (2021) or our corresponding research
poster. The *hexamaps* (hexagonally binned heatmaps) are
outlined in Jalal & Burke (2020).

Before we start, let's load the relevant packages for the following analyses.

```
library(APCtools)
library(dplyr) # general data handling
library(mgcv) # estimation of generalized additive regression models (GAMs)
library(ggplot2) # data visualization
library(ggpubr) # arranging multiple ggplots in a grid with ggarrange()
# set the global theme of all plots
theme_set(theme_minimal())
```

APC analyses require long-term panel or repeated cross-sectional
data. The package includes two exemplary datasets on the travel behavior
of German tourists (dataset `travel`) and the number of
unintentional drug overdose deaths in the United States
(`drug_deaths`). See the respective help pages
`?travel` and `?drug_deaths` for details.

In the following, we will use the `travel` dataset to
investigate whether the travel distances of German travelers' main trips
change mainly with the life cycle of a person (age effect), with
macro-level developments such as the decreasing air travel prices of
recent decades (period effect), or with the generational membership of a
person, which is shaped by similar socialization and historical
experiences (cohort effect).

`data(travel)`
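
As a quick check before plotting, the temporal coverage of the data can be inspected. The sketch below assumes that `travel` contains the columns `age` and `period`, which the plotting functions used in this document rely on; the vector `cohort` is derived here purely for illustration and is not stored in the dataset.

```r
library(APCtools)
data(travel)

# Ranges covered by the two observed temporal dimensions
range(travel$period, na.rm = TRUE)
range(travel$age, na.rm = TRUE)

# Birth cohorts follow from the APC identity (illustrative helper vector)
cohort <- travel$period - travel$age
range(cohort, na.rm = TRUE)
```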

Different functions are available for descriptively visualizing observed structures. This includes plots for the marginal distribution of some variable of interest, 1D plots for the development of some variable over age, period or cohort, as well as density matrices that visualize the development over all temporal dimensions.

The marginal distribution of a variable can be visualized using
`plot_density`. Metric variables can be plotted using a
density plot or a boxplot, while categorical variables can be plotted
using a bar chart.

```
gg1 <- plot_density(dat = travel, y_var = "mainTrip_distance", log_scale = TRUE)
gg2 <- plot_density(dat = travel, y_var = "mainTrip_distance", log_scale = TRUE,
                    plot_type = "boxplot")
gg3 <- plot_density(dat = travel, y_var = "household_size")
ggpubr::ggarrange(gg1, gg2, gg3, nrow = 1)
```

Plotting the distribution of a variable against age, period or cohort
is possible with the function `plot_variable`. The distribution
of metric variables is visualized using boxplots or line
charts (see argument `plot_type`), and the distribution of categorical
variables using bar charts. The latter show relative frequencies by
default, but can be changed to show absolute numbers by specifying the
argument `geomBar_position = "stack"`.

```
plot_variable(dat = travel, y_var = "mainTrip_distance",
              apc_dimension = "period", plot_type = "line", ylim = c(0,1000))
```

`plot_variable(dat = travel, y_var = "household_size", apc_dimension = "period")`
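
For comparison, the following sketch draws the same bar chart with absolute instead of relative frequencies, using the `geomBar_position` argument described above. It assumes that `APCtools` is installed and reloads the `travel` data so that it runs on its own; the object name `gg_abs` is arbitrary.

```r
library(APCtools)
data(travel)

# Stacked bars: bar heights are absolute numbers of observations per period
gg_abs <- plot_variable(dat = travel, y_var = "household_size",
                        apc_dimension = "period",
                        geomBar_position = "stack")
gg_abs
```

The stacked version additionally reveals how the number of observations differs between periods, which the relative-frequency version hides by design.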

To include all temporal dimensions in one plot, `APCtools`
contains the function `plot_densityMatrix`. In Weigert et
al. (2021), this plot type was referred to as *ridgeline matrix*
when plotting multiple density plots for a metric variable. The basic
principle of a density matrix is to (i) visualize two of the temporal
dimensions on the x- and y-axis (specified using the argument
`dimensions`), so that the third temporal dimension is
represented on the diagonals of the matrix, and (ii) to categorize the
respective variables on the x- and y-axis in meaningful groups. The
function then creates a grid, where each cell contains the distribution
of the selected `y_var` variable in the respective
category.
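
Why the third dimension ends up on the diagonals can be seen from a small base R sketch that does not use `APCtools` itself. It assumes ten-year groups on both axes (all object names are illustrative) and computes, for each cell of an age-by-period grid, the birth years that the cell can contain:

```r
# Lower bounds of ten-year age and period groups (illustrative choice)
age_lower    <- seq(20, 80, by = 10)
period_lower <- seq(1970, 2010, by = 10)

# One row per cell of the age-by-period grid
grid <- expand.grid(age_lower = age_lower, period_lower = period_lower)

# Birth years covered by each cell, using cohort = period - age
grid$cohort_min <- grid$period_lower - (grid$age_lower + 9)
grid$cohort_max <- (grid$period_lower + 9) - grid$age_lower

# Cells with the same difference between period and age group share a diagonal
grid$diagonal <- grid$period_lower - grid$age_lower

# Each diagonal maps to exactly one range of birth years
unique(grid[order(grid$diagonal), c("diagonal", "cohort_min", "cohort_max")])
```

Moving one step along a diagonal increases both the age group and the period group by ten years, so the range of covered birth years stays the same.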

By default, age and period are depicted on the x- and y-axis,
respectively, and cohort on the diagonals. The categorization is defined
by specifying two of the arguments `age_groups`,
`period_groups` and `cohort_groups`.

```
age_groups <- list(c(80,89),c(70,79),c(60,69),c(50,59),
                   c(40,49),c(30,39),c(20,29))
period_groups <- list(c(1970,1979),c(1980,1989),c(1990,1999),
                      c(2000,2009),c(2010,2019))
plot_densityMatrix(dat = travel,
                   y_var = "mainTrip_distance",
                   age_groups = age_groups,
                   period_groups = period_groups,
                   log_scale = TRUE)
```

To emphasize the effect of the variable depicted on the diagonal
(here: cohort), individual diagonals can be highlighted using the argument
`highlight_diagonals`.

```
plot_densityMatrix(dat = travel,
                   y_var = "mainTrip_distance",
                   age_groups = age_groups,
                   period_groups = period_groups,
                   highlight_diagonals = list("born 1950 - 1959" = 8,
                                              "born 1970 - 1979" = 10),
                   log_scale = TRUE)
```