| Type: | Package | 
| Title: | Joint Models of Survival and Multivariate Longitudinal Data | 
| Version: | 0.4.5 | 
| Date: | 2024-10-05 | 
| Description: | Fit joint models of survival and multivariate longitudinal data. The longitudinal data is specified by generalised linear mixed models. The joint models are fit via maximum likelihood using an approximate expectation maximisation algorithm. Bernhardt (2015) <doi:10.1016/j.csda.2014.11.011>. | 
| License: | GPL-3 | 
| Depends: | R (≥ 3.6.0), glmmTMB, survival | 
| Imports: | Rcpp (≥ 1.0.6), MASS, methods, mvtnorm, pracma, reformulas, stats, statmod, xtable | 
| LinkingTo: | Rcpp, RcppArmadillo | 
| Encoding: | UTF-8 | 
| RoxygenNote: | 7.2.3 | 
| LazyData: | true | 
| URL: | https://github.com/jamesmurray7/gmvjoint | 
| BugReports: | https://github.com/jamesmurray7/gmvjoint/issues | 
| NeedsCompilation: | yes | 
| Packaged: | 2024-10-05 21:00:06 UTC; lshjm44 | 
| Author: | James Murray [aut, cre] | 
| Maintainer: | James Murray <james.murray@lshtm.ac.uk> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-10-05 22:40:02 UTC | 
Joint Models of Survival and Multivariate Longitudinal Data
Description
gmvjoint allows the user to fit joint models of survival and multivariate longitudinal data. The 
longitudinal data is specified by generalised linear mixed models (GLMMs). The joint models 
are fit via maximum likelihood using an approximate EM algorithm first proposed by Bernhardt et
al. (2015). The GLMMs are specified using the same syntax as for package glmmTMB Brooks et
al. (2017). The joint models themselves are then the  flexible extensions to those in e.g.
Wulfsohn and Tsiatis (1997). The user is able to simulate data under many different response
types.
Author(s)
James Murray <j.murray7@ncl.ac.uk>
References
Bernhardt PW, Zhang D and Wang HJ. A fast EM Algorithm for Fitting Joint Models of a Binary Response to Multiple Longitudinal Covariates Subject to Detection Limits. Computational Statistics and Data Analysis 2015; 85; 37–53
Mollie E. Brooks, Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler and Benjamin M. Bolker (2017). glmmTMB Balances Speed and Flexibility Among Packages for Zero-inflated Generalized Linear Mixed Modeling. The R Journal, 9(2), 378-400.
Murray, J and Philipson P. A fast approximate EM algorithm for joint models of survival and multivariate longitudinal data.Computational Statistics and Data Analysis 2022
Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997; 53(1), 330-339.
Primary biliary cirrhosis data
Description
Primary biliary cirrhosis (PBC) data. PBC is a chronic liver disease which affects the bile 
ducts of the liver, complications of which can ultimately lead to death. The longitudinal 
profile of numerous biomarkers were observed for 312 patients at the Mayo Clinic between 1974
and 1984  with patients assigned to either the active (D-penicillamine, n=154 (50.6
placebo treatment arm (Murtaugh 1994). The data is publicly available in numerous places, 
including joineRML and survival. The presence of many longitudinal biomarkers 
of clinical interest as well as an event-time has lead to the PBC data becoming
widely used in literature.
Usage
data('PBC')
Format
data.frame with 312 patients and 19 variables:
- id
- Subject identifier 
- survtime
- Survival time in years 
- drug
- Binary indicator covariate: was the patient assigned active ( - drug=1) or placebo?
- sex
- Binary indicator covariate: Takes value 1 if the subject is female, and zero if male. 
- time
- Time of visit (0=baseline). 
- ascites
- Binary response variable. Takes value 1 if accumulation of fluid in abdomen ("ascites") present. 
- hepatomegaly
- Binary response variable. Takes value 1 if enlarged liver ("hepatomegaly") present. 
- spiders
- Binary response variable. Takes value 1 if malformed blood vessels in skin ("hepatomegaly") present. 
- edema
- Factor variable describing edema therapy, see - pbcseq.
- serBilir
- Serum bilirubin (measured in mg/dl). 
- serChol
- Serum cholesterol (measured in mg/dl). 
- album
- Serum albumin (measured in mg/dl). 
- alkaline
- Alkaline phosphotase (measured in U/liter). 
- SGOT
- Aspartate aminotransferase (measured in U/liter). 
- platelets
- Platelet count per cubic ml/1000. 
- histologic
- Histologic stage of disease, see - pbcseq.
- status
- Survival status, - status=1if the subject experienced mortality and- =0if censored.
- age
- Standardised age at baseline visit. 
Details
Nine longitudinal biomarkers exist with varying degrees of completeness in the data.
Source
References
Murtaugh PA, Dickson ER, Van Dam GM, Malinchoc M, Grambsch PM, Langworthy AL, Gips CH. Primary biliary cirrhosis: Prediction of short-term survival based on repeated patient visits. Hepatology 1994; 20(1); 126-134.
Receiver Operator Characteristics (ROC) for a joint model.
Description
Using longitudinal information available up to a time, establish diagnostic capabilities (ROC, AUC and Brier score) of a fitted joint model.
Usage
ROC(fit, data, Tstart, delta, control = list(), progress = TRUE, boot = FALSE)
Arguments
| fit | a joint model fit by the   | 
| data | the data to which the original  | 
| Tstart | The start of the time window of interest,  | 
| delta | scalar denoting the length of time interval to check for failure times. | 
| control | list of control arguments to be passed to  | 
| progress | should a progress bar be shown, showing the current progress of the ROC
function (
to  | 
| boot | logical. Not currently used, legacy argument. | 
Value
A list of class ROC.joint consisting of: 
- Tstart
- numeric denoting the start of the time window of interest; all dynamic predictions generated used longitudinal information up-to time - T_{\mathrm{start}}.
- delta
- scalar which denotes length of interval to check, such that the window is defined by - [T_{\mathrm{start}}, T_{\mathrm{start}}, + \delta].
- candidate.u
- candidate vector of failure times to calculate dynamic probability of surviving for each subject alive in - dataat time- T_{\mathrm{start}}.
- window.failures
- numeric denoting the number of observed failures in - [T_{\mathrm{start}}, T_{\mathrm{start}}, + \delta].
- Tstart.alive
- numeric denoting the risk set at - Tstart.
- metrics
- a - data.framecontaining probabilistic- thresholdswith:- TPtrue positives;- FNfalse negatives;- FPfalse positives;- TNtrue negatives;- TPRtrue positive rate (sensitivity);- FPRfalse positive rate (1-specificity);- Accaccuracy;- PPVpositive predictive value (precision);- NPVnegative predictive value;- F1sF1 score and- JYouden's J statistic.
- AUC
- the area under the curve. 
- BrierScore
- The Brier score. 
- PE
- The predicted error (taking into account censoring), loss function: square. 
- MH.acceptance
- Raw acceptance percentages for each subject sampled. 
- MH.acceptance.bar
- mean acceptance of M-H scheme across all subjects. 
- simulation.info
- list containing information about call to - dynPred.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
dynPred, and plot.ROC.joint.
Examples
data(PBC)
PBC$serBilir <- log(PBC$serBilir)
long.formulas <- list(serBilir ~ drug * time + (1 + time|id))
surv.formula <- Surv(survtime, status) ~ drug
family <- list('gaussian')
fit <- joint(long.formulas, surv.formula, PBC, family)
(roc <- ROC(fit, PBC, Tstart = 8, delta = 2, control = list(nsim = 25)))
plot(roc)
Anova for joint models
Description
Perform a likelihood ratio test between two (nested) joint 
models. The user must decide whether the models are truly nested.
Usage
## S3 method for class 'joint'
anova(object, object2, ...)
Arguments
| object | a joint model fit by the  | 
| object2 | a joint model fit by the  | 
| ... | additional arguments (none used). | 
Value
A list of class anova.joint with elements 
- mod0
- the name of - object.
- l0
- the log-likelihood of the nested model, i.e. fit under the null. 
- AIC0
- AIC for - object.
- BIC0
- BIC for - object.
- mod1
- the name of - object2.
- l1
- the log-likelihood under the alternative hypothesis. 
- AIC1
- AIC for - object2.
- BIC1
- BIC for - object2.
- LRT
- likelihood ratio test statistic. 
- p
- the p-value of - LRT.
- warnSurv
- internal - logical value for printing difference in survival models. 
- warnRanefs
- internal - logical value for printing difference in random effects specifications. 
Author(s)
James Murray (j.murray7@ncl.ac.uk)
See Also
joint and logLik.joint.
Examples
rm(list=ls())
data(PBC)
# Compare quadratic vs linear time specification for log(serum bilirubin) -----
PBC$serBilir <- log(PBC$serBilir)
long.formulas1 <- list(serBilir ~ drug * time + (1 + time|id))
long.formulas2 <- list(serBilir ~ drug * (time + I(time^2)) + (1 + time + I(time^2)|id))
surv.formula <- Surv(survtime, status) ~ drug
family <- list('gaussian')
# Fit the two competing models (fit is nested in fit2) ------------------------
fit <- joint(long.formulas1, surv.formula, PBC, family, 
             control = list(verbose = FALSE))
fit2 <- joint(long.formulas2, surv.formula, PBC, family, control = list(verbose = FALSE))
anova(fit, fit2)
# Quadratic terms improve fit significantly. 
Bootstrapping a joint object
Description
Use an existing model fit by joint along with the data object originally
used and obtain a mean estimate, standard errors and 95% confidence interval using the
bootstrap. The original data is resampled by subject, not by observation.
Usage
boot.joint(
  fit,
  data,
  boot.size = NULL,
  nboot = 100L,
  replace = TRUE,
  progress = TRUE,
  use.MLEs = TRUE,
  control = list()
)
Arguments
| fit | a joint model fit by the  | 
| data | the original data used to fit the above joint model. | 
| boot.size | integer, specifies the number of subjects to resample in the bootstrapping
approach. The default value is  | 
| nboot | integer, specifies the number of bootstrap samples, default value is 
 | 
| replace | logical, should sampling be done with replacement? Defaults to 
 | 
| progress | logical, should a text progress bar showing overall progress be shown
and updated after each successful bootstrapped model fit? Defaults to  | 
| use.MLEs | logical, should the MLEs of the  | 
| control | a list of control arguments, with same possible arguments as shown in 
 | 
Value
A list of class boot.joint which contains the MLEs from supplied joint
object, as well as the bootstrapped summaries and some model/computation information.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Examples
# Bivariate fit on PBC data -----------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin', 'platelets'))
PBC <- na.omit(PBC) 
# Specify bivariate fit
long.formulas <- list(
  albumin ~ time*drug + (1 + time|id),
  platelets ~ time * drug + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, family = list('gaussian', 'poisson'))
# Set 50 bootstraps, with lower absolute tolerance and convergence of 'either'.
BOOT <- boot.joint(fit, PBC, nboot = 50L, control = list(tol.abs = 5e-3, conv = 'either'),
                   use.MLEs = TRUE)
BOOT # Print to console via S3 method
Obtain conditional distribution of the random effects
Description
Obtain the conditional distribution of the random effects of a joint model
fit. This is achieved by a Metropolis scheme. Approximate normality across random effects is
expected, and could be useful in diagnosing potential issues surrounding model fits.
Usage
cond.ranefs(fit, burnin = 500L, N = 3500L, tune = 2)
Arguments
| fit | a joint model fit by the  | 
| burnin | Number of burn-in iterations to discard, defaults to 500. | 
| N | Number of MC iterations to carry out post burn-in, defaults to 3500. | 
| tune | Tuning parameter, problem-specific, defaults to 2. | 
Value
A list of class cond.b.joint containing: 
- walks
- A list of length - ncontaining the history of- b_ipost burn-in.
- acceptance
- A numeric vector containing the acceptance rate for each sampled subject. 
- M
- The ModelInfo list from - joint. Used by S3 methods for class- cond.b.joint.
- bhats
- Posterior estimates at MLEs for the random effects. Same as - ranef(joint).
- Sigmahats
- The covariances of - bhats.
- D
- The MLE estimate for the variance-covariance matrix of random effects from - fit.
- q
- Dimension of random effects. 
- K
- Number of responses. 
- qnames
- The names of the random effects as determined by call to - joint.
- burnin
- The amount of burn-in used. 
- N
- Number of MC iterations. 
- tune
- tuning parameter used 
- nobs
- The number of observations for each subject for each response. 
- elapsed.time
- Time taken for - cond.ranefsto complete.
See Also
Examples
dat <- simData()$data
long.formulas <- list(Y.1 ~ time + cont + bin + (1 + time|id), 
                      Y.2 ~ time + cont + bin + (1 + time|id))
surv.formula <- Surv(survtime, status) ~ bin
fit <- joint(long.formulas, surv.formula, dat, list("gaussian","gaussian"))
cond.b <- cond.ranefs(fit, burnin = 50L, N = 1000, tune = 2)
cond.b
plot(cond.b) # Overall 
plot(cond.b, id = 1) # Plot the first subject (see plot.cond.b.joint).
Dynamic predictions for survival sub-model in a multivariate joint model.
Description
Calculates individualised conditional survival probabilities for subjects
during a period of follow-up using a joint model fit along with requisite longitudinal 
process history. 
Note that this function is largely designed for use within the ROC function which assesses discriminatory power of the joint model, however it does function by itself with proper use of its arguments.
Usage
dynPred(
  data,
  id,
  fit,
  u = NULL,
  nsim = 200,
  progress = TRUE,
  scale = NULL,
  df = NULL
)
Arguments
| data | the data to which the original  | 
| id | subject identifier, i.e. for which subject is the conditional survival probabilities desired? | 
| fit | a joint model fit by the   | 
| u | a numeric  | 
| nsim | how many Monte Carlo simulations should be carried out? Defaults to 
 | 
| progress | a logical, if  | 
| scale | numeric scales the variance-covariance parameter in the proposal distribution for 
the Metropolis-Hastings algorithm. Defaults to  | 
| df | numeric denotes the degrees of freedom of the proposed  | 
Details
Dynamic predictions for the time-to-event process based on information available
on the subject's longitudinal process up to given time t are calculated by Monte Carlo
simulation outlined in Rizopoulos (2011). For a subject last observed at time t, the 
probability that they survive until future time u is
  Pr(T_i \ge u | T \ge t; \boldsymbol{Y}_i, \boldsymbol{b}_i; \boldsymbol{\Omega}) \approx
  \frac{S(u|\hat{\boldsymbol{b}}_i; \boldsymbol{\Omega})}
  {S(t|\hat{\boldsymbol{b}}_i; \boldsymbol{\Omega})}
where T_i is the true failure time for subject i, \boldsymbol{Y}_i their
longitudinal measurements up to time t, and S() the survival function.
\boldsymbol{\Omega} is drawn from the multivariate normal distribution with mean
\hat{\boldsymbol{\Omega}} and its variance taken from a fitted joint object.
\hat{\boldsymbol{b}} is drawn from the t distribution by means of a
Metropolis-Hastings algorithm with nsim iterations.
Value
A list of class dynPred which consists of three items: 
- pi
- A - data.framewhich contains each candidate failure time (supplied by- u), with the mean, median and 2.5% and 97.5% quantiles of probability of survival until this failure time.
- pi.raw
- A - matrixof with- nsimrows and- length(u)columns, each row represents the- lth conditional survival probability of survival each- usurvival time. This is largely for debugging purposes.
- MH.accept
- The acceptance rate of the Metropolis-Hastings algorithm on the random effects. 
Author(s)
James Murray (j.murray7@ncl.ac.uk).
References
Bernhardt PW, Zhang D and Wang HJ. A fast EM Algorithm for Fitting Joint Models of a Binary Response to Multiple Longitudinal Covariates Subject to Detection Limits. Computational Statistics and Data Analysis 2015; 85; 37–53
Rizopoulos D. Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data. Biometrics 2011; 67: 819–829.
See Also
ROC and plot.dynPred.
Examples
data(PBC)
PBC$serBilir <- log(PBC$serBilir)
# Focus in on id 81, who fails at around 7 years of follow-up. \code{dynPred} allows us to
# infer how the model believes their survival probability would've progressed (ignoring the
# true outcome at start time).
# Univariate -----------------------------------------------------------
long.formulas <- list(serBilir ~ drug * time + (1 + time|id))
surv.formula <- Surv(survtime, status) ~ drug
family <- list('gaussian')
fit <- joint(long.formulas, surv.formula, PBC, family)
preds <- dynPred(PBC, id = 81, fit = fit, u = NULL, nsim = 200,
                 scale = 2)
preds
plot(preds)
# Bivariate ------------------------------------------------------------
# Does introduction of albumin affect conditional survival probability?
long.formulas <- list(
  serBilir ~ drug * time + I(time^2) + (1 + time + I(time^2)|id),
  albumin ~ drug * time + (1 + time|id)
)
fit <- joint(long.formulas, surv.formula, data = PBC, family = list("gaussian", "gaussian"))
bi.preds <- dynPred(PBC, id = 81, fit = fit, u = NULL, nsim = 200, 
                    scale = fit$coeffs$D/sqrt(fit$ModelInfo$n))
bi.preds
plot(bi.preds) # Appears to level-off dramatically; perhaps indicative of this id's albumin
               # levels, or acceleration in serBilir trajectory around 8.5 years.
Extract AIC from a joint model fit.
Description
Extract AIC from a joint model fit.
Usage
## S3 method for class 'joint'
extractAIC(fit, scale, k = 2, conditional = FALSE, ...)
Arguments
| fit | A fitted  | 
| scale | See  | 
| k | Numeric specifying the "weight" of degrees of freedom (default  | 
| conditional | Should AIC of conditional or observed log-likelihood be used? Defaults
to  | 
| ... | additional arguments (none used). | 
Value
A numeric vector of length 2, with first and second element giving
- df
- The degrees of freedom for the fitted model. 
- AIC
- The Akaike Information Criterion for the fitted model. 
Obtain joint model fitted values
Description
returns the fitted values from a joint object. Note that the 
linear predictor for each k=1,\dots,K response is returned.
Usage
## S3 method for class 'joint'
fitted(object, as = "matrix", ...)
Arguments
| object | a joint model fit by the  | 
| as | should the fitted values be returned as a  | 
| ... | Additional arguments (none used). | 
Value
A matrix (or list) with a column (or list entry) for each of the fitted linear
predictors with class fitted.joint.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Examples
# Bivariate fit on PBC data -----------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin', 'platelets'))
PBC <- na.omit(PBC) 
# Specify bivariate fit
long.formulas <- list(
  albumin ~ time*drug + (1 + time|id),
  platelets ~ time * drug + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, family = list('gaussian', 'poisson'))
fitted(fit)
Extract fixed effects from a joint object.
Description
Extract fixed effects from a joint object.
Usage
## S3 method for class 'joint'
fixef(object, what = c("long", "surv"), ...)
Arguments
| object | a joint model fit by the  | 
| what | character string. Should the  | 
| ... | additional arguments (none used). | 
Value
A vector containing requested fixed effects.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Examples
# Univariate fit on PBC data -------------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin'))
PBC <- na.omit(PBC) 
# Specify simple univariate fit
long.formulas <- list(
  albumin ~ time + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, family = list('gaussian'))
fixef(fit, 'long')
fixef(fit, 'surv')
Fit a joint model to time-to-event and multivariate longitudinal data
Description
Fit a joint model to time-to-event and multivariate longitudinal data
Usage
joint(
  long.formulas,
  surv.formula,
  data,
  family,
  disp.formulas = NULL,
  control = list()
)
Arguments
| long.formulas | A list of formula objects specifying the  | 
| surv.formula | A formula specifying the time-to-event sub-model. Must be usable by 
 | 
| data | A  | 
| family | A list of length  | 
| disp.formulas | An optional list of length  | 
| control | A list of control values: 
 | 
Details
Function joint fits a joint model to time-to-event data and multivariate 
longitudinal data. The longitudinal data can be specified by numerous models encompassing
a fairly wide range of data. This joint model fit is achieved by the use of an approximate
EM algorithm first proposed in Bernhardt et al. (2015), and later used in the 'classic' 
multivariate joint model in Murray and Philipson (2022). Each longitudinal response is 
modelled by 
h_k(E[Y_{ik}|b_{ik};\Omega]) = X_{ik}\beta_k + Z_{ik}b_{ik}
where h_k is a known, monotonic link function. An association is induced between the 
Kth response and the hazard \lambda_i(t) by: 
\lambda_i(t)=\lambda_0(t)\exp\{S_i^T\zeta + \sum_{k=1}^K\gamma_kW_k(t)^Tb_{ik}\}
where \gamma_k is the association parameter and W_k(t) is the vector function of 
time imposed on the Kth random effects structure (i.e. intercept-and-slope; spline).
Value
An object with class joint. See joint.object for information.
Family specification
Currently, five families are available for implementation, spanning continuous, binary and count data types:
- 'gaussian'
- Normally distributed. The identity link is used. A term - \sigma_kwill be estimated, denoting the variance of this response
- 'binomial'
- For binary data types, a logit link is used. 
- 'poisson'
- For count data types where dispersion is either non-consequential or ignored. A log link is used. 
- 'genpois'
- For count data types where dispersion is at least of some secondary interest. A log link is used. A term - \sigma_kis estimated, denoting the dispersion,- \varphiof the response. This follows interpretation of Zamani & Ismail (2012):- \varphi>0: Over-dispersion;- \varphi<0: Under-dispersion.- Var[Y]=(1+\varphi)^2\mu.
- 'Gamma'
- For continuous data where a Gamma distribution might be sensible. The log link is used. A term - \sigma_kis be estimated, denoting the (log) shape of the distribution, which is then reported as- \varphi_k=\exp\{\sigma_k\}.
- "negbin"
- For count data types where overdispersion is modelled. A log link is used. A term - \sigma_kis estimated, which is then reported as- \varphi_k=\exp\{\sigma_k\}which is the overdispersion. The variance of the response is- Var[Y]=\mu+\mu^2/\varphi.
For families "negbin", "Gamma", "genpois", the user can define the 
dispersion model desired in disp.formulas. For the "negbin" and "Gamma"
cases, we define \varphi_i=\exp\{W_i\sigma_i\} (i.e. the exponent of the linear 
predictor of the dispersion model; and for "genpois" the identity of the linear
is used.
Dispersion models
The disp.formulas in the function call allows the user to model the dispersion for
a given sub-model if wanted. The default value disp.formulas = NULL simply imposes
an 'intercept only' model. If the kth item in disp.formulas corresponds to 
a longitudinal sub-model with no dispersion term, then it is simply ignored. With this in 
mind then, if a dispersion model is only required for, say, one sub-model, then the 
corresponding item in this list of models should be specified as such, with the others set to
~1.
Standard error estimation
We follow the approximation of the observed empirical information matrix detailed by 
Mclachlan and Krishnan (2008), and later used in joineRML (Hickey et al., 2018).
These are only calculated if post.process=TRUE. Generally, these SEs are well-behaved,
but their reliability will depend on multiple factors: Sample size; number of events; 
collinearity of REs of responses; number of observed times, and so on. Some more discussion/
references are given in vcov.joint.
Convergence of the algorithm
A few convergence criteria (specified by control$conv) are available: 
- abs
- Convergence reached when maximum absolute change in parameter estimates is - <tol.abs.
- rel
- Convergence reached when maximum absolute relative change in parameter estimates is - <tol.rel. A small amount (- tol.den) is added to the denominator to eschew numerical issues if parameters are nearly zero.
- either
- Convergence is reached when either - absor- relare met.
- sas
- Assess convergence for parameters - |\Omega_a|- <tol.thrby the- abscriterion, else- rel. This is the default.
Note that the baseline hazard is updated at each EM iteration, but is not monitored for convergence.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
References
Bernhardt PW, Zhang D and Wang HJ. A fast EM Algorithm for Fitting Joint Models of a Binary Response to Multiple Longitudinal Covariates Subject to Detection Limits. Computational Statistics and Data Analysis 2015; 85; 37–53
Hickey GL, Philipson P, Jorgensen A, Kolamunnage-Dona R. joineRML: a joint model and
software package for time-to-event and multivariate longitudinal outcomes.
BMC Med. Res. Methodol. 2018; 50
McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. Second Edition. Wiley-Interscience; 2008.
Murray, J and Philipson P. A fast approximate EM algorithm for joint models of survival and multivariate longitudinal data.Computational Statistics and Data Analysis 2022; 170; 107438
Zamani H and Ismail N. Functional Form for the Generalized Poisson Regression Model, Communications in Statistics - Theory and Methods 2012; 41(20); 3666-3675.
See Also
summary.joint, logLik.joint, boot.joint,
extractAIC.joint, fixef.joint, ranef.joint,
vcov.joint, joint.object and xtable.joint. For
data simulation see simData.
Examples
# 1) Fit on simulated bivariate data, (1x gaussian, 1x poisson) --------
beta <- do.call(rbind, replicate(2, c(2, -0.1, 0.1, -0.2), simplify = FALSE))
gamma <- c(0.3, -0.3)
D <- diag(c(0.25, 0.09, 0.25, 0.05))
family <- list('gaussian', 'poisson')
data <- simData(ntms = 10, beta = beta, D = D, n = 100,
                family = family, zeta = c(0, -0.2),
                sigma = list(0.16, 0), gamma = gamma)$data
# Specify formulae and target families
long.formulas <- list(
  Y.1 ~ time + cont + bin + (1 + time|id),  # Gaussian
  Y.2 ~ time + cont + bin + (1 + time|id)   # Poisson
)
surv.formula <- Surv(survtime, status) ~ bin
fit <- joint(long.formulas, surv.formula, data, family)
# 2) Fit on PBC data -----------------------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'serBilir', 'albumin', 'spiders', 'platelets'))
PBC <- na.omit(PBC) 
# Specify GLMM sub-models, including interaction and quadratic time terms
long.formulas <- list(
  log(serBilir) ~ drug * (time + I(time^2)) + (1 + time + I(time^2)|id),
  albumin ~ drug * time + (1 + time|id),
  platelets ~ drug * time + (1 + time|id),
  spiders ~ drug * time + (1|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <-  joint(long.formulas, surv.formula, PBC, 
              family = list("gaussian", "gaussian", "poisson", "binomial"),
              control = list(verbose = TRUE))
fit
# 3) Fit with dispersion models ----------------------------------------
beta <- do.call(rbind, replicate(2, c(2, -0.1, 0.1, -0.2), simplify = FALSE))
gamma <- c(0.3, -0.3)
D <- diag(c(0.25, 0.09, 0.25, 0.05))
family <- list('negbin', 'poisson')   # As an example; only requires one dispersion model.
sigma <- list(c(1, 0.2), 0)           # Need to specify the model in simData call too.
disp.formulas = list(~time, ~1)       # Even though poisson doesn't model dispersion, need to
                                      # populate this element in disp.formulas!
# Simulate some data
data <- simData(ntms = 10, beta = beta, D = D, n = 500,
                family = family, zeta = c(0, -0.2), sigma = sigma,
                disp.formulas = disp.formulas, gamma = gamma)$data
# Now fit using joint
long.formulas <- list(
  Y.1 ~ time + cont + bin + (1+time|id),
  Y.2 ~ time + cont + bin + (1+time|id)
)
fit <- joint(
  long.formulas, Surv(survtime, status) ~ bin,
  data, family, disp.formulas = disp.formulas
)
fit
summary(fit)
Fitted joint object
Description
An object returned by the joint function, with class joint
a fitted joint model. Objects of this class currently have methods for: logLik,
print, ranef, fixef, summary, AIC, and vcov.
Usage
joint.object
Format
An object of class NULL of length 0.
Value
A list with the following components.
- coeffs
- A list containing parameter estimates: - D
- The variance-covariance matrix of the random effects. 
- beta
- Vector of fixed effects for longitudinal processes. 
- sigma
- List of dispersion parameters, families with no dispersion parameter are returned as an unnamed zero value. 
- gamma
- Vector of association parameters. 
- zeta
- Vector of time-invariant survival coefficients. 
 
- hazard
- A matrix containing unique failure times - ft, their hazard contribution- hazand the number of events at that failure time- nev.
- ModelInfo
- A list containing information on the model fit: - ResponseInfo
- A vector containing response names with (family) reported. 
- Resps
- A vector containing response names only. 
- family
- A list of families fit. 
- K
- An integer specifying the number of longitudinal sub-models. 
- Pcounts
- A list containing informations about the number of parameters/random effects: - P
- A vector of length K containing the number of fixed effects for each response (in order). 
- Pd
- A vector of length K containing the number of dispersion parameters for each response (in order) 0 denotes no parameter for that response. 
- q
- An integer denoting the number of random effects. 
- vD
- An integer denoting the number of unique variance-covariance parameters estimated. 
 
- long.formulas
- A list of - long.formulas(i.e. from- jointcall).
- disp.formulas
- A list of - disp.formulas(i.e. from- jointcall). If no- disp.formulasare supplied to- joint, then this is populated by a list of- K"- ~1". The environment is set to- parent.framein this case to avoid memory overheads in returned objects.
- surv.formula
- Formula object from - jointcall.
- survtime
- The name of the event time used in - surv.formula.
- status
- The name of the event indicator used in - surv.formula.
- control
- List of control parameters used, see - joint.
- convergence.criteria
- List of parameters relating to the stopping rule. 
- inds
- A list of length two, named - Rand- Cpp, each of which contains the indices for fixed effects- \betafor each response, or the random effects- bfor the named platform.
- n
- Number of subjects. 
- nobs
- A vector containing total number of observations for each response. 
- mi
- A - Kx- nmatrix containing the number of observations for subject- ifor the- kth response.
- nev
- Number of events. 
- id.assign
- A list containing the original ids of subjects in the - datasupplied to- joint, and the id assigned to them for use in subsequent functions.
 
- Hessian
- The (approximated) Hessian found at MLEs. Only returned if control argument - post.process=TRUE
.
- vcov
- The full variance-covariance matrix between parameters. Only returned if control argument - post.process=TRUE.
- SE
- A named vector of approximated standard error for each estimated parameter. Only returned if control argument - post.process=TRUE.
- logLik
- log-likelihood evaluated at parameter estimates. Only returned if control argument - post.process=TRUE.
- REs
- The random effects, with subject-specific variance matrices attributed. If control argumnet - post.process=TRUEthen these are found at MLEs (i.e. are posterior estimates), otherwise they are taken from the final EM iteration.
- elapsed.time
- Named numeric containing breakdown of elapsed time for - jointfit.
- dmats
- A list of data matrices on each of the longitudinal and survival processes for each subject. 
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Log-likelihood for joint model.
Description
Calculate joint log-likelihood, degrees of freedom, AIC and BIC of joint model fit.
Usage
## S3 method for class 'joint'
logLik(object, conditional = FALSE, ...)
Arguments
| object | a  | 
| conditional | Logical. Should the conditional or observed data log-likelihood be returned? See details. | 
| ... | additional arguments (none used). | 
Details
Calculate the log-likelihood of a joint model of survival and multivariate longitudinal
data (i.e. a joint object). The argument conditional manages whether
or not the log-likelihood conditional on the random effects, or simply
the observed data log-likelihood is returned (the default, conditional = FALSE). 
If conditional = TRUE, then the log-likelihood conditional on the random 
effects is returned. That is
\log f(T_i, \Delta_i, Y_i|b_i;\Omega) = 
      \log f(Y_i|b_i; \Omega) + \log f(T_i, \Delta_i|b_i; \Omega) + \log f(b_i|\Omega)
If conditional = FALSE, then the observed data log-likelihood is returned i.e.
\log\int f(Y_i|b_i; \Omega)f(T_i, \Delta_i|b_i; \Omega)f(b_i|\Omega)db_i.
Additionally, the degrees of freedom, \nu is given by
\nu = \code{length(vech(D))} + \sum_{k=1}^K\{P_k + P_{\sigma_k}\} + P_s,
where P_k denotes the number of coefficients estimated for the kth response,
and P_{\sigma_k} the number of dispersion parameters estimated. P_s denotes
the number of survival coefficients, i.e. the length of c(zeta, gamma). Finally,
all covariance parameters are captured in length(vech(D)). 
With the degrees of freedom, we can additionally compute AIC and BIC, which are defined in no special way; and are calculated using the observed data log-likelihood.
Value
Returns an object of class logLik, a number which is the log-likelihood
of the fitted model object. This has multiple attributes: df which is the 
degrees of freedom, df.residual; the number of residual degrees of freedom;
AIC and BIC which are the Akaike or Bayes information criterion evaluated at 
either the conditional or observed log-likelihood (as requested by argument 
conditional).
Author(s)
James Murray (j.murray7@ncl.ac.uk)
References
Henderson R, Diggle P, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics 2000; 1(4); 465-480.
Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics 1997; 53(1); 330-339.
See Also
extractAIC.joint and anova.joint
Examples
# Bivariate simulated data (2x Gaussian)
data <- simData(n = 100,
   D = diag(c(.25, .04, .2, .02)),
   gamma = c(0.4, -0.2), theta = c(-2, .2))$data
fit <- joint(list(
    Y.1 ~ time + cont + bin + (1 + time|id),
    Y.2 ~ time + cont + bin + (1 + time|id)
  ), Surv(survtime, status) ~ cont + bin, 
  data = data, 
  family = list('gaussian', 'gaussian'))
logLik(fit)
Parsing the survival formula and constructing all survival-related data objects.
Description
Creates a set of survival data and fits a coxph model
using a survival formula and a data set.
Usage
parseCoxph(surv.formula, data, center = TRUE)
Arguments
| surv.formula | A formula readable by 'coxph'. | 
| data | a set of data containing covariate information for variables named by ‘surv.formula'. Can be of any ’completeness', as the function returns a reduced set. | 
| center | Should the covariate matrices be mean-centered before being returned?
defaults to  | 
Value
A list with class parseCoxph containing: 
- survdata
- reduced version of - data, with only one row per subject, with covariates specified by- surv.formulaalong with survival time and failure status.
- Smat
- matrix containing all requisite survival covariates (one row per subject). 
- ph
- the model fit from - coxph.
- Delta
- list of failure indicators for each of the unique subjects. 
- n
- number of unique subjects. 
- ft
- vector of unique failure times. 
- nev
- vector containing number of failures at each failure time - ft.
- survtime
- the name of the - timevariable in- surv.formula.
- status
- the name of the - eventvariable in- surv.formula.
Examples
data = simData()$data
parseCoxph(Surv(survtime, status) ~ bin, data = data)
Plot receiver operator characteristics.
Description
Produces a simple plot showing the true positive rate (sensitivity) against
the false positive rate (1-specificy) for a dynamic prediction routine on a joint model
along a specified time interval.
Usage
## S3 method for class 'ROC.joint'
plot(x, legend = TRUE, show.Youden = TRUE, show.F1 = FALSE, ...)
Arguments
| x | an object with class  | 
| legend | should a legend displaying the number in risk set; number of failures in interval;
area under the ROC curve and median Brier score be added to the bottom-right corner of the ROC 
plot? Default is  | 
| show.Youden | should a line be drawn showing optimal cut-point using Youden's J statistic?
Defaults to  | 
| show.F1 | should a line be drawn showing optimal cut-point using the F-score?
Defaults to  | 
| ... | additional arguments (none used). | 
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Plot posterior distribution of the random effects for a joint model.
Description
Plot posterior distribution of the random effects for a joint model.
Usage
## S3 method for class 'cond.b.joint'
plot(
  x,
  id = NULL,
  show.cov = TRUE,
  nrow = NULL,
  ncol = NULL,
  title = NULL,
  ...
)
Arguments
| x | object of class  | 
| id | integer, if specified this shows the plots the posterior density for the chosen
id (as they appeared in original  | 
| show.cov | logical, should the 'true' (normal) density be overlaid on these plots? If 
 | 
| nrow | integer specifying the number of rows to use in paneled plot. | 
| ncol | integer specifying the number of columns to use in paneled plot. Note that 
both  | 
| title | optional character string to specify the title. This is placed at the top middle of the graphics device. | 
| ... | Additional arguments, these are passed to the plotting of the individual densities. | 
Examples
# Bivariate Gaussian 
dat <- simData(n = 100)$data
long.formulas <- list(Y.1 ~ time + cont + bin + (1 + time|id), 
                      Y.2 ~ time + cont + bin + (1 + time|id))
surv.formula <- Surv(survtime, status) ~ bin
fit <- joint(long.formulas, surv.formula, dat, list("gaussian","gaussian"))
cond.b <- cond.ranefs(fit, burnin = 500L, N = 4500L)
# Posterior for entire sample with dummy title
plot(cond.b, title = "Example title")
# Posterior for a randomly selected id
dummy.id <- sample(1:100, 1)
# Should show good agreement between true posterior and approximate normal.
plot(cond.b, id = dummy.id, title = paste0("id: ", dummy.id))
Plot conditional survival probabilities.
Description
Produces a simple plot of the probability the subject survives given failure times along with the 95% confidence interval if probabilities are generated from a simulation scheme, and simply the first-order estimate otherwise.
Usage
## S3 method for class 'dynPred'
plot(x, what = c("median", "mean"), ...)
Arguments
| x | an object with class  | 
| what | should the  | 
| ... | additional arguments (none used). | 
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Plot joint model residuals
Description
Plot residuals obtained by a joint model (obtained by joint). 
If the residuals.joint object represents the longitudinal process, a simple (paneled)
plot is produced (one for each response). If the residual object contains the Cox-Snell 
residuals then several plots are produced (interactively): The KM estimate of survival 
function of said residuals and then repeated for each survival covariate in the model call
to joint (if requested).
Usage
## S3 method for class 'residuals.joint'
plot(x, strata = FALSE, ...)
Arguments
| x | an object with class  | 
| strata | logical, should strata (for the survival sub-model only). Defaults to 
 | 
| ... | additional arguments (none used). | 
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Extract random effects from a joint object.
Description
Return the random effects \hat{\boldsymbol{b}} which maximises the complete
data log-likelihood at the MLEs \hat{\Omega}.
Usage
## S3 method for class 'joint'
ranef(object, Var = FALSE, ...)
Arguments
| object | a joint model fit by the  | 
| Var | logical, should the estimated variance of the random effects at  | 
| ... | additional arguments (none used). | 
Value
A matrix containing required random effects effects. If Var=TRUE,
instead a list is returned with first element the matrix of random effects and second a 
matrix of the variances \hat{\Sigma}. Note that these are posterior modes
of the random effects. Conditional distribution can be found by cond.ranefs.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Examples
# Univariate fit on PBC data -----------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin'))
PBC <- na.omit(PBC) 
# Specify univariate fit
long.formulas <- list(
  albumin ~ time*drug + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, family = list('gaussian'))
b <- ranef(fit, FALSE)
Obtain joint model residuals
Description
returns the Pearson residuals values from a joint object.
Usage
## S3 method for class 'joint'
residuals(
  object,
  what = c("longit", "surv"),
  type = c("response", "pearson"),
  ...
)
Arguments
| object | a joint model fit by  | 
| what | character string. Should the  | 
| type | character. The residual type for  | 
| ... | Additional arguments (none used). | 
Value
a named list of length K of class residuals.joint containing
residuals produced by the joint model for each of the k=1,\dots,K responses, 
along with the fitted values as an attribute.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
fitted.joint plot.residuals.joint
Examples
# Trivariate fit on PBC data -----------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin', 'hepatomegaly', 'platelets'))
PBC <- na.omit(PBC) 
# Specify trivariate fit
long.formulas <- list(
  albumin ~ time*drug + (1 + time|id),
  platelets ~ time * drug + (1 + time|id),
  hepatomegaly ~ time * drug + (1|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, 
             family = list('gaussian', 'poisson', 'binomial'))
R <- residuals(fit, type = 'pearson')
plot(R)
plot(residuals(fit, what = "surv"))
Simulate realisations from a generalised poisson distribution
Description
Simulate realisations from a generalised poisson distribution
Usage
rgenpois(mu, phi)
Arguments
| mu | A numeric vector of rates  | 
| phi | A numeric specifying the dispersion  | 
Details
Follows the "GP-1" implementation of the generalised Poisson distribution outlined 
in Zamani & Ismail (2012). The variance of produced Y is (1+\varphi)^2\mu. As such
the dispersion parameter is bounded (i.e. not in positive reals as with CMP distribution).
Value
An appropriately-dimensioned vector of count data.
References
Zamani H and Ismail N. Functional Form for the Generalized Poisson Regression Model, Communications in Statistics - Theory and Methods 2012; 41(20); 3666-3675.
Simulate data from a multivariate joint model
Description
Simulate multivariate longitudinal and survival data from a joint model 
specification, with potential mixture of response families. Implementation is similar 
to existing packages (e.g. joineR, joineRML).
Usage
simData(
  n = 250,
  ntms = 10,
  fup = 5,
  family = list("gaussian", "gaussian"),
  sigma = list(0.16, 0.16),
  beta = rbind(c(1, 0.1, 0.33, -0.5), c(1, 0.1, 0.33, -0.5)),
  D = NULL,
  gamma = c(0.5, -0.5),
  zeta = c(0.05, -0.3),
  theta = c(-4, 0.2),
  cens.rate = exp(-3.5),
  regular.times = TRUE,
  dof = Inf,
  random.formulas = NULL,
  disp.formulas = NULL,
  return.ranefs = FALSE
)
Arguments
| n | the number of subjects | 
| ntms | the number of time points | 
| fup | the maximum follow-up time, such that t = [0, ..., fup] with length  | 
| family | a  | 
| sigma | a  | 
| beta | a  | 
| D | a positive-definite matrix specifying the variance-covariance matrix for the random effects. If not supplied an identity matrix is assumed. | 
| gamma | a  | 
| zeta | a vector of length 2 specifying the coefficients for the baseline covariates in the survival sub-model, in the order of continuous and binary. | 
| theta | parameters to control the failure rate, see baseline hazard. | 
| cens.rate | parameter for  | 
| regular.times | logical, if  | 
| dof | integer, specifies the degrees of freedom of the multivariate t-distribution
used to generate the random effects. If specified, this t-distribution is used. If left
at the default  | 
| random.formulas | allows user to specify if an intercept-and-slope ( | 
| disp.formulas | allows user to specify the dispersion model simulated. Intended use is
to allow swapping between intercept only (the default) and a time-varying one ( | 
| return.ranefs | a logical determining whether the true random effects should be 
returned. This is largely for internal/simulation use. Default  | 
Details
simData simulates data from a multivariate joint model with a mixture of 
families for each k=1,\dots,K response. The specification of family changes
requisite dispersion parameter sigma, if applicable. The family list can
(currently) contain: 
- "gaussian"
- Simulated with identity link, corresponding item in - sigmawill be the variance.
- "poisson"
- Simulated with log link, corresponding dispersion in - sigmacan be anything, as it doesn't impact simulation.
- "binomial"
- Simulated with logit link, corresponding dispersion in - sigmacan be anything, as it doesn't impact simulation.
- "negbin"
- Simulated with a log link, corresponding item in - sigmawill be the overdispersion defined on the log scale. Simulated variance is- \mu+\mu^2/\varphi.
- "genpois"
- Simulated with a log link, corresponding item in - sigmawill be the dispersion. Values < 0 correspond to under-dispersion, and values > 0 over- dispersion. See- rgenpoisfor more information. Simulated variance is- (1+\varphi)^2\mu.
- "Gamma"
- Simulated with a log link, corresponding item in - sigmawill be the shape parameter, defined on the log-scale.
Therefore, for families "negbin", "Gamma", "genpois", the user can
define the dispersion model desired in disp.formulas, which creates a data matrix 
W. For the "negbin" and "Gamma" cases, we define 
\varphi_i=\exp\{W_i\sigma_i\} (i.e. the exponent of the linear predictor of the 
dispersion model); and for "genpois" the identity of the linear is used.
Value
A list of two data.frames: One with the full longitudinal data, and another 
with only survival data. If return.ranefs=TRUE, a matrix of the true b values is
also returned. By default (i.e. no arguments provided), a bivariate Gaussian set of joint
data is returned.
Baseline hazard
When simulating the survival time, the baseline hazard is a Gompertz distribution controlled 
by theta=c(x,y):
\lambda_0(t) = \exp{x + yt}
where y is the shape parameter, and the scale parameter is \exp{x}.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
References
Austin PC. Generating survival times to simulate Cox proportional hazards models with time-varying covariates. Stat Med. 2012; 31(29): 3946-3958.
See Also
Examples
# 1) A set of univariate data ------------------------------------------
beta <- c(2.0, 0.33, -0.25, 0.15)
# Note that by default arguments are bivariate, so need to specify the univariate case
univ.data <- simData(beta = beta,    
                     gamma = 0.15, sigma = list(0.2), family = list("gaussian"), 
                     D = diag(c(0.25, 0.05)))
                     
# 2) Univariate data, with failure rate controlled ---------------------
# In reality, many facets contribute to the simulated failure rate, in 
# this example, we'll just atler the baseline hazard via 'theta'.
univ.data.highfail <- simData(beta = beta,
                              gamma = 0.15, sigma = list(0.0), family = list("poisson"),
                              D = diag(c(0.40, 0.08)), theta = c(-2, 0.1))
# 3) Trivariate (K = 3) mixture of families with dispersion parameters -
beta <- do.call(rbind, replicate(3, c(2, -0.1, 0.1, -0.2), simplify = FALSE))
gamma <- c(0.3, -0.3, 0.3)
D <- diag(c(0.25, 0.09, 0.25, 0.05, 0.25, 0.09))
family <- list('gaussian', 'genpois', 'negbin')
sigma <- list(.16, 1.5, log(1.5))
triv.data <- simData(ntms=15, family = family, sigma = sigma, beta = beta, D = D, 
                     gamma = gamma, theta = c(-3, 0.2), zeta = c(0,-.2))
# 4) K = 4 mixture of families with/out dispersion ---------------------
beta <- do.call(rbind, replicate(4, c(2, -0.1, 0.1, -0.2), simplify = FALSE))
gamma <- c(-0.75, 0.3, -0.6, 0.5)
D <- diag(c(0.25, 0.09, 0.25, 0.05, 0.25, 0.09, 0.16, 0.02))
family <- list('gaussian', 'poisson', 'binomial', 'gaussian')
sigma <- list(.16, 0, 0, .05) # 0 can be anything here, as it is ignored internally.
mix.data <- simData(ntms=15, family = family, sigma = sigma, beta = beta, D = D, gamma = gamma,
                    theta = c(-3, 0.2), zeta = c(0,-.2))
                    
# 5) Bivariate joint model with two dispersion models. -----------------
disp.formulas <- list(~time, ~time)          # Two time-varying dispersion models
sigma <- list(c(0.00, -0.10), c(0.10, 0.15)) # specified in form of intercept, slope
D <- diag(c(.25, 0.04, 0.50, 0.10))
disp.data <- simData(family = list("genpois", "negbin"), sigma = sigma, D = D,
                     beta = rbind(c(0, 0.05, -0.15, 0.00), 1 + c(0, 0.25, 0.15, -0.20)),
                     gamma = c(1.5, 1.5),
                     disp.formulas = disp.formulas, fup = 5)            
# 6) Trivariate joint model with mixture of random effects models ------
# It can be hard to e.g. fit a binomial model on an intercept and slope, since e.g.
# glmmTMB might struggle to accurately fit it (singular fits, etc.). To that end, could
# swap the corresponding random effects specification to be an intercept-only.
family <- list("gaussian", "binomial", "gaussian")
# A list of formulae, even though we want to change the second sub-model's specification
# we need to specify the rest of the items, too (same as disp.formulas, sigma).
random.formulas <- list(~time, ~1, ~time)
beta <- rbind(c(2, -0.2, 0.5, -0.25), c(0, 0.5, 1, -1), c(-2, 0.2, -0.5, 0.25))
# NOTE that the specification of RE matrix will need to match.
D <- diag(c(0.25, 0.09, 1, 0.33, 0.05))
# Simulate data, and return REs as a sanity check...
mix.REspec.data <- simData(beta = beta, D = D, family = family,
                           gamma = c(-0.5, 1, 0.5), sigma = list(0.15, 0, 0.15),
                           random.formulas = random.formulas, return.ranefs = TRUE)
Summary of an joint object.
Description
Generate summary of a fitted multivariate joint model.
Usage
## S3 method for class 'joint'
summary(object, ...)
Arguments
| object | a joint model fit by the  | 
| ... | additional arguments (none used). | 
Value
Object of class summary.joint.
Author(s)
James Murray j.murray7@ncl.ac.uk
See Also
joint and joint.object
Examples
# Simple univariate on log(serum bilirubin) ----------------------------
data(PBC)
long.formulas <-  list(
  log(serBilir) ~ drug * (time + I(time^2)) + (1 + time + I(time^2)|id)
)
surv.formula <- Surv(survtime, status) ~ sex + drug
fit <- joint(long.formulas = long.formulas,
             surv.formula = surv.formula,
             data = PBC, family = list("gaussian"))
summary(fit)              
# Bivariate with a dispersion model ------------------------------------
PBC <- na.omit(PBC[,c('id', 'survtime', 'status', 'sex', 
                      'drug', 'platelets', 'albumin', 'time')])
long.formula <- list(
  platelets ~ time * drug + (1 + time|id),
  albumin ~ time * drug + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ sex + drug
fit <- joint(long.formula, surv.formula, PBC, 
             family = list("negbin", "gaussian"),
             disp.formula = list(~time, ~1))
summary(fit)
Extract the variance-covariance matrix from a joint fit.
Description
Extract the variance-covariance matrix from a joint fit.
Usage
## S3 method for class 'joint'
vcov(object, corr = FALSE, ...)
Arguments
| object | a joint model fit by the  | 
| corr | should the correlation matrix be returned instead of the variance-covariance? | 
| ... | extra arguments (none used). | 
Details
Uses the observed-empirical approximation of information matrix (Mclachlan & Krishnan, 2008). The standard errors for the baseline hazard are not estimated.
Value
A variance-covariance matrix for the joint model object.
Methodology
Many competing ways exist for obtaining the observed information matrix in an EM algorithm. 
In the context of joint modelling, the observed empirical approximation of the information 
matrix has been used previously (joineRML, Hickey et al. 2018). Elsewhere,
estimation of the observed information in a semi-parametric setting is outlined neatly in
Xu et al. (2014). Here, they advocate for approximation of this information matrix by 
numerical differentiation of the profile Fisher Score vector. We do not consider this 
methodology owing to its computational expense. That is, for each element of \Omega 
which is perturbed by some small amount \tilde{\Omega}^{p}, we must re-calculate
\hat{b}_i and \hat{\Sigma}_i.
Author(s)
James Murray j.murray7@ncl.ac.uk
References
Hickey GL, Philipson P, Jorgensen A, Kolamunnage-Dona R. joineRML: a joint model and
software package for time-to-event and multivariate longitudinal outcomes.
BMC Med. Res. Methodol. 2018; 50
McLachlan GJ, Krishnan T. The EM Algorithm and Extensions. Second Edition. Wiley-Interscience; 2008.
Xu C, Baines PD, Wang J. Standard error estimation using the EM algorithm for the joint modeling of survival and longitudinal data. Biostatistics 2014; 15(4).
Examples
# Univariate fit on PBC data -------------------------------------------
data(PBC)
# Subset data and remove NAs
PBC <- subset(PBC, select = c('id', 'survtime', 'status', 'drug', 'time',
                              'albumin'))
PBC <- na.omit(PBC) 
# Specify univariate fit
long.formulas <- list(
  albumin ~ time + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ drug
fit <- joint(long.formulas, surv.formula, PBC, family = list('gaussian'))
vcov(fit)
Print an LaTeX-ready xtable for a joint object.
Description
Prints an xtable output for a fitted joint object to the console,
or to a specified save location
Usage
## S3 method for class 'joint'
xtable(
  x,
  caption = NULL,
  label = NULL,
  align = NULL,
  digits = NULL,
  display = NULL,
  auto = FALSE,
  p.val = FALSE,
  max.row = NULL,
  dp = 3,
  vcov = FALSE,
  capture = FALSE,
  capture.location = "",
  hlines = "middle-bottom",
  booktabs = TRUE,
  size = "footnotesize",
  ...
)
Arguments
| x | a joint model fit by the  | 
| caption | character, specifies the  | 
| label | character, specifies the  | 
| align | character, specifies the  | 
| digits | integer, specifies the  | 
| display | character, specifies the  | 
| auto | logical, specifies the  | 
| p.val | logical, should p-values be returned? Defaults to  | 
| max.row | integer, the number of rows after which the table is ‘broken’ vertically
and merged horizontally; useful for long tables. Defaults to  | 
| dp | integer, the number of decimal places to round the estimate, standard error and
confidence intervals to; defaults to  | 
| vcov | logical, should the half-vectorisation of the block diagonal of covariance 
matrix be reported? Default is  | 
| capture | logical, should the printed  | 
| capture.location | character, if  | 
| hlines | character, specifies which horizontal lines are used in the outputted
LaTeX table. Supply a character string which contains  | 
| booktabs | logical, if  | 
| size | character, LaTeX size to be placed before the tabular environment, defaults
to  | 
| ... | additional arguments, none used. | 
Value
A LaTeX-ready xtable print-out of the joint model. A list containing 
constituent tables is also returned invisibly, along with the final xtable output.
Author(s)
James Murray (j.murray7@ncl.ac.uk).
See Also
Examples
# Bivariate joint model ------------------------------------------------
require(xtable)
data <- simData(n = 100)$data
long.formula <- list(
  Y.1 ~ time + cont + bin + (1 + time|id),
  Y.2 ~ time + cont + bin + (1 + time|id)
)
surv.formula <- Surv(survtime, status) ~ cont + bin
family <- list("gaussian", "gaussian")
fit <- joint(long.formula, surv.formula, data, family)
xtable(fit)
# Example of arguments: add dummy caption, add p-values.
xtable(fit, p.val = TRUE, dp = 4, caption = "This is a caption")
# Change size, place horizontal lines everywhere
xtable(fit, size = "normalsize", hlines = c("top-middle-bottom"))
# Make a wider table without booktabs 
xtable(fit, booktabs = FALSE, max.row = 6)