This vignette shows how to use the package to emulate a feed-forward system of three synthetic computer models.

```
library(dgpsi)
init_py()
```

```
## Connecting to Python ... done
## Importing required Python modules ... done
## The Python environment for 'dgpsi' is successfully loaded.
```

We consider the following synthetic system

that involves three models defined by:

```
# Model 1
<- function(x) {
f1 sin(7.5*x)+1)/2
(
}# Model 2
<- function(x) {
f2 2/3*sin(2*(2*x - 1))+4/3*exp(-30*(2*(2*x-1))^2)-1/3
}# Model 3
<- function(x) {
f3 1]*x[2]^2
x[
}# Linked Model
<- function(x) {
f123 f3(c(f1(x),f2(f1(x))))
}
```

We then generate 10 training data points for Model 1 and 15 training data points for Model 2 and 3:

```
# Training data for Model 1
<- seq(0, 1, length = 10)
X1 <- sapply(X1, f1)
Y1 # Training data for Model 2
<- seq(0, 1, length = 15)
X2 <- sapply(X2, f2)
Y2 # Training data for Model 3
<- cbind(X2, Y2)
X3 <- apply(X3, f3, MARGIN = 1) Y3
```

We construct and train a GP emulator with MatÃ©rn-2.5 kernel:

`<- gp(X1, Y1, name = "matern2.5") m1 `

```
## Auto-generating a GP structure ... done
## Initializing the GP emulator ... done
## Training the GP emulator ... done
```

We now validate the trained GP emulator by `plot()`

by LOO
(alternatively, one can first use `validate()`

to store the
LOO results before plotting with `plot()`

):

`plot(m1)`

```
## Initializing ... done
## Post-processing LOO results ... done
## Plotting ... done
```

Since we will use `m1`

later for linked emulation, we must
specify how it links to the emulators in the feeding layer through
`linked_idx`

in `dgp()`

. Since `m1`

is
in the first layer of the system, `linked_idx`

refers to the
column indices of the global input `X`

that will feed into
`m1`

, and therefore we should have set
`linked_idx = c(1)`

in `dgp()`

when we first
constructed the emulator. Instead of re-training the emulator to provide
this information, we can simply use `set_linked_idx()`

to add
this information to `m1`

:

`<- set_linked_idx(m1, c(1)) m1 `

We construct a two-layered DGP emulator with MatÃ©rn-2.5 kernels to emulate the Model 2:

`<- dgp(X2, Y2, depth = 2, name = "matern2.5", linked_idx = c(1)) m2 `

```
## Auto-generating a 2-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 2: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 500/500 [00:02<00:00, 232.30it/s]
## Imputing ... done
```

Since all outputs of emulators in the first layer is from
`m1`

and `m2`

uses the output from `m1`

as its input, we set `linked_idx = c(1)`

in
`dgp()`

above.

The following plot visualizes the LOO of the trained DGP emulator
`m2`

:

`plot(m2)`

```
Initializing ... done
Post-processing LOO results ... done
Plotting ... done
```

We now construct a three-layered DGP emulator with MatÃ©rn-2.5 kernels to emulate the Model 3:

`<- dgp( X3, Y3, depth = 3, name = "matern2.5", linked_idx = list( c(1), c(1) ) ) m3 `

```
## Auto-generating a 3-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 3: 100%|â–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆâ–ˆ| 500/500 [00:06<00:00, 72.56it/s]
## Imputing ... done
```

Since `m3`

takes both outputs from `m1`

in
layer 1 and `m2`

in layer 2, we set
`linked_idx = list( c(1), c(1) )`

. The first vector in the
list says that the output dimension 1 from all emulators (i.e.,
`m1`

) in the first layer feeds into `m3`

and the
second vector in the list says that the output dimension 1 from all
emulators (i.e., `m2`

) in the second layer feeds into
`m3`

. Note that the dimensions in `X3`

is ordered
in consistent with `linked_idx`

, i.e., the first and second
columns of `X3`

correspond to the output of `m1`

and `m2`

respectively.

The following plot visualizes the LOO of the trained DGP emulator
`m3`

:

`plot(m3)`

```
Initializing ... done
Post-processing LOO results ... done
Plotting ... done
```

With the GP emulator `m1`

(for Model 1), the DGP emulator
`m2`

(for Model 2), and the DGP emulator `m3`

(for
Model 3) at hand, we are now ready to build the linked emulator by first
using `combine()`

to construct the system hierarchy
`struc`

and then applying `lgp()`

for the linked
emulator.

```
<- combine(list(m1), list(m2), list(m3))
struc <- lgp(struc) m_link
```

Note that each layer of an emulator system is represented by a list.
In our case, each layer contains a single emulator. Therefore, we need
to put the corresponding emulator in a list using
`list()`

.

We can check the relations between emulators in different layers by
applying `summary()`

to `m_link`

:

`summary(m_link)`

```
## +-----------+--------------+------+----------------------------+-----------------+
## | Layer No. | Emulator No. | Type | Connection | External Inputs |
## +-----------+--------------+------+----------------------------+-----------------+
## | Layer 1 | Emu 1 | GP | Global input: [1] | No |
## | Layer 2 | Emu 1 | DGP | Emu 1 in Layer 1: output 1 | No |
## | Layer 3 | Emu 1 | DGP | Emu 1 in Layer 1: output 1 | No |
## | | | | Emu 1 in Layer 2: output 1 | |
## +-----------+--------------+------+----------------------------+-----------------+
## 1. 'Connection' gives the indices of emulators and the associated output dimensions that are linked to the emulator referred by 'Layer No.' and 'Emulator No.'.
## 2. 'External Inputs' indicates if the emulator (referred by 'Layer No.' and 'Emulator No.') has external inputs that are not provided by the feeding emulators.
```

For comparison, we construct a GP emulator for the whole system by generating 15 training data points from Model 1 and Model 2:

```
<- seq(0, 1, length = 15)
X_gp <- sapply(X_gp, f123)
Y_gp <- gp(X_gp, Y_gp, name = 'matern2.5') m_gp
```

```
## Auto-generating a GP structure ... done
## Initializing the GP emulator ... done
## Training the GP emulator ... done
```

Finally, we make predictions from both the GP emulator and the linked emulator at 300 testing data points over \([0,1]\):

```
# Testing input
<- seq(0, 1, length = 300)
test_x # Testing output
<- sapply(test_x, f123) test_y
```

```
# linked emulator
<- predict(m_link, x = test_x)
m_link # GP emulator
<- predict(m_gp, x = test_x) m_gp
```

and plot the predictions to compare their emulation performances:

```
# extract predictive means and variances from the linked emulator
<- m_link$results$mean$emulator1
mu <- sqrt(m_link$results$var$emulator1)
sd <- mu + 2*sd
up <- mu - 2*sd
lo
# extract predictive means and variances from the GP emulator
<- m_gp$results$mean
mu_gp <- sqrt(m_gp$results$var)
sd_gp <- mu_gp + 2*sd_gp
up_gp <- mu_gp - 2*sd_gp
lo_gp
par(cex=0.7, mar = c(5, 5, 0.9, 0.9)+0.2)
# GP emulator
par(fig = c(0, 0.5, 0.2, 0.8))
plot(test_x, mu_gp, type = 'l', lty = 2, lwd = 1.5, col = 'black', main = "GP Emulator", xlab = 'x1', ylab = 'y3', ylim=c(-0.1,0.65))
polygon(c(test_x, rev(test_x)), c(up_gp, rev(lo_gp)), col = 'grey80', border = F)
lines(test_x, test_y, type = 'l', col = "#D55E00", lwd = 2)
lines(test_x, mu_gp, type = 'l', lty = 2, lwd = 1.5, col = 'black')
# linked emulator
par(fig = c(0.5, 1, 0.2, 0.8), new = TRUE)
plot(test_x, mu, type = 'l', lty = 2, lwd = 1.5, col = 'black', main = "Linked Emulator", xlab = 'x1', ylab = 'y3', ylim=c(-0.1,0.65))
polygon(c(test_x, rev(test_x)), c(up, rev(lo)), col = 'grey80', border = F)
lines(test_x, test_y, type = 'l', col = "#D55E00", lwd = 2)
lines(test_x, mu, type = 'l', lty = 2, lwd = 1.5, col = 'black')
```

It can be seen from the above plots that the linked emulator outperforms the GP emulator with significantly better mean predictions and predictive uncertainties.

In real-life applications, we are rarely able to generate this many testing data points from the underlying computer simulators to evaluate the emulators over the whole input space. However, we are still able to retain some available realizations from the computer simulators for the linked emulator validation. Say we were able to afford 20 runs of the above linked computer system:

```
# OOS testing input
<- sample(seq(0, 1, length = 300), 20)
test_x_oos # OOS testing output
<- sapply(test_x_oos, f123) test_y_oos
```

Then, we can conduct OOS validations for the GP emulator:

`plot(m_gp, test_x_oos, test_y_oos, style = 2)`

```
## Initializing ... done
## Post-processing OOS results ... done
## Plotting ... done
```

and the linked emulator:

`plot(m_link, test_x_oos, test_y_oos, style = 2)`

```
## Initializing ... done
## Post-processing OOS results ... done
## Plotting ... done
```

which show that the linked emulator outperforms the GP emulator with significantly better predictive accuracy and lower NRMSE.