[R-pkgs] new R packages for phylogenetic comparative methods
Krzysztof Bartoszek
krzb@r @end|ng |rom protonm@||@ch
Fri Apr 5 13:51:57 CEST 2019
Dear all,
I wanted to let you know about four phylogenetic comparative methods (PCM) packages that have recently become available (3 on CRAN and 1 on GitHub) and that will hopefully be of interest to somebody. Three of them go significantly beyond the Brownian motion (BM) and Ornstein-Uhlenbeck (OU) processes.
1) There is a new version of mvSLOUCH available. The most important change is that the inference engine has been completely rewritten. Instead of directly evaluating the multivariate normal density formula, it uses PCMBase (an R package, see below) to obtain the value of the likelihood in linear (in the number of tip species) instead of quadratic time. Furthermore, as there is no need to store the between-species-between-traits variance-covariance matrix, much larger clades than previously can be handled. From the user's perspective the main changes are the interface and increased functionality:
a) the phylogeny has to be now in the phylo (instead of ouch) format,
b) the trait data has to be a matrix (and not a data frame),
c) the package can automatically compare multiple models (BM, OUOU, OUBM and different assumed structures on the drift and diffusion matrices of the multivariate OU process) through the function estimate.evolutionary.model(), which returns the one with the lowest AICc (the set of models to consider can either be automatically generated or user defined, see generate.model.setups()),
d) the user can set entries of the OU model's parameters to desired values (e.g. 0) and these will be kept fixed throughout the estimation,
e) a number of control parameters to be passed to the estimation procedure and optim() are available for the user to manipulate.
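To illustrate the selection rule in item c), here is a minimal base R sketch of choosing the model with the lowest AICc. It uses the standard small-sample formula AICc = -2 logLik + 2k + 2k(k+1)/(n-k-1). This is not mvSLOUCH's internal code: the function name aicc(), the log-likelihoods, the parameter counts and the sample size are all made up for illustration, and how mvSLOUCH itself counts n is not stated here.

```r
# Small-sample corrected AIC (standard formula, not taken from mvSLOUCH):
# AICc = -2*logLik + 2*k + 2*k*(k+1)/(n - k - 1)
aicc <- function(loglik, k, n) {
  stopifnot(n - k - 1 > 0)  # the correction term is undefined otherwise
  -2 * loglik + 2 * k + (2 * k * (k + 1)) / (n - k - 1)
}

# Hypothetical fitted models: all numbers below are invented.
fits <- data.frame(
  model  = c("BM", "OUOU", "OUBM"),
  loglik = c(-120.4, -112.9, -111.8),
  k      = c(5, 11, 14),            # assumed numbers of free parameters
  stringsAsFactors = FALSE
)
n_obs <- 60  # assumed number of observations, e.g. 30 tips x 2 traits

fits$aicc <- aicc(fits$loglik, fits$k, n_obs)

# The model with the lowest AICc is the one that would be returned.
best <- fits$model[which.min(fits$aicc)]
print(fits[order(fits$aicc), ])
print(best)
```

With these invented numbers the richer OU models have higher log-likelihoods, but the penalty for their additional parameters outweighs the gain, so the simplest model is selected.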
2) PCMFit: the goal of PCMFit is to provide a generic tool for inference and selection of phylogenetic comparative models (PCMs). Currently, the package implements Gaussian and mixed Gaussian phylogenetic models (MGPM) over all tree types (including non-ultrametric and polytomic trees). The package supports non-existing traits or missing measurements for some of the traits on some of the species. The package supports specifying measurement error associated with each tip of the tree or inferring a measurement error parameter for a group of tips. The Gaussian phylogenetic models include various parametrizations of Brownian motion (BM) and Ornstein-Uhlenbeck (OU) multivariate branching processes. The mixed Gaussian models represent models with shifts in the model parameters as well as the type of model at points of the tree. Each shift-point is described as a pair of a shift-node and associated type of model (e.g. OU or BM) driving the trait evolution from the beginning of the branch leading to the shift-node toward the shift-node and its descendants until reaching a tip or another shift-point. The function PCMFit is used to fit a given PCM or an MGPM for a given tree with specified shift-points. The function PCMFitMixed is used to fit an ensemble of possible MGPMs over a tree for which the shift-points are unknown. This function can perform model selection of the best MGPM for a given tree and data according to an information loss function such as the Akaike information criterion (AIC). The package has been thoroughly tested and applied to real data in the related research article entitled "Automatic Generation of Evolutionary Hypotheses using Mixed Gaussian Phylogenetic Models" (currently in review). Currently, the package is available from https://github.com/venelin/PCMFit . The web-page https://venelin.github.io/PCMFit/ provides access to documentation and related resources.
3) PCMBase: the computational engine that mvSLOUCH uses. Given a phylogeny (phylo format), (multivariate) trait measurements and a user-provided model of the traits' evolution, the package calculates the likelihood. The family of allowed models is rather general. The package can handle any model for which lineages after speciation do not interact and the density of the transition along a branch is:
i) Gaussian
ii) the mean at the end of the branch depends linearly on the trait value at the start of the branch
iii) the covariance matrix does not depend on the value at the start of the branch
The likelihood is calculated in time linear in the number of tip species.
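Conditions i) to iii) above can be written compactly as follows. The symbols omega, Phi and V are notation chosen here for illustration; x_start and x_end denote the trait vector at the start and end of a branch of length t. The BM and OU cases are the standard transition laws of those processes, given as examples of members of the family.

```latex
% Transition along a branch of length t (conditions i-iii):
x_{\mathrm{end}} \mid x_{\mathrm{start}}
  \sim \mathcal{N}\bigl(\omega(t) + \Phi(t)\, x_{\mathrm{start}},\; V(t)\bigr),
\qquad \omega,\ \Phi,\ V \text{ do not depend on } x_{\mathrm{start}}.

% Brownian motion, dX = \Sigma_x\, dW, with \Sigma = \Sigma_x \Sigma_x^{T}:
\omega(t) = 0, \qquad \Phi(t) = I, \qquad V(t) = t\,\Sigma.

% Ornstein-Uhlenbeck, dX = -H (X - \theta)\, dt + \Sigma_x\, dW:
\omega(t) = \bigl(I - e^{-Ht}\bigr)\theta, \qquad
\Phi(t) = e^{-Ht}, \qquad
V(t) = \int_0^t e^{-Hs}\, \Sigma\, e^{-H^{T}s}\, ds.
```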
The package is described in
Venelin Mitov, Krzysztof Bartoszek, Georgios Asimomitis, Tanja Stadler (2018).
Fast likelihood evaluation for multivariate phylogenetic comparative methods: the PCMBase R package
arXiv URL https://arxiv.org/abs/1809.09014 .
4) pcmabc: a package that allows for simulation and ABC estimation under any model for which the user can provide a function to simulate trait (discrete/continuous, uni- or multivariate) evolution along a branch. Special support is given for SDE-based models, using the yuima package.
Krzysztof Bartoszek, Pietro Lio' (2019) Modelling trait dependent speciation with Approximate Bayesian Computation. Acta Physica Polonica B Proceedings Supplement 12(1): 25-47.
URL https://www.actaphys.uj.edu.pl/fulltext?series=Sup&vol=12&page=25 .
Hope somebody will find these useful
Best wishes
Krzysztof Bartoszek