[R-sig-ME] Course "Introduction to Mixed (Hierarchical) models for biologists using R"

Oliver Hooker oliverhooker at prstatistics.com
Mon Nov 6 13:04:28 CET 2017


Introduction to Mixed (Hierarchical) models for biologists using R 
(IMBR01)

https://www.prstatistics.com/course/introduction-to-mixed-hierarchical-models-for-biologists-using-r-imbr01/

14th May 2018 - 18th May 2018

Prof. Subhash Lele

Course overview:
Mixed models, also known as hierarchical models and multilevel models,
are a useful class of models for many applied sciences, including
biology, ecology and evolution. The goal of this course is to give a
thorough introduction to the logic, theory and, most importantly,
implementation of these models to solve practical problems in ecology.
Participants are not expected to know mathematics beyond basic
algebra and calculus. Participants are expected to know some R
programming and to be familiar with linear and generalized linear
regression. We will be using JAGS (Just Another Gibbs Sampler) for
Markov Chain Monte Carlo (MCMC) simulations for analyzing mixed models. 
The course will be conducted so that participants have substantial 
hands-on experience.

Monday 14th
Linear and Generalized linear models
To understand mixed models, the most important first step is to 
thoroughly understand the linear and generalized linear models. Also, 
when conducting the data analysis, it is useful to fit a simpler fixed 
effects model before trying to fit a more complex mixed effects 
model. Hence, we will start with a very detailed review of these models. 
We are assuming that the participants are familiar with these models and 
hence we will emphasize some important, but not commonly covered, 
topics. This will also give us an opportunity to unify the notation, 
review the basic R commands and fill in any gaps in knowledge and
understanding of these topics.
1. We will show the use of non-parametric exploratory techniques such as 
classification and regression trees (CART) for learning about important 
covariates and possible non-linearities in the relationships.
2. We will emphasize graphical and simulation based methods (e.g. Gelman 
and Hill, 2006) to understand and explore the implications of the 
fitted model.
3. We will discuss graphical tools such as marginal and conditional 
plots that are useful for conveying the results of a multiple regression 
model to a lay person.
4. We will emphasize the use of graphical tools to conduct regression
diagnostics and assess the appropriateness of the model.
5. We will discuss the important concepts of confounding, effect
modification and interaction. These are particularly important for
drawing causal, not just correlational, inferences from observational
studies.

Tuesday 15th
Computational inference
Many of the topics that will be covered involve the use of matrix 
algebra and calculus. While these mathematical techniques are essential 
tools for a mathematical statistician who is trying to understand the 
theory behind the methods, they can be avoided in practice by using 
simulation based techniques. Built-in functions such as 'lm' and 'glm'
use the method of maximum likelihood to fit regression models, estimate
the parameters and conduct statistical inference.
We will discuss the use of JAGS (Just Another Gibbs Sampler) and the R
package 'dclone' to fit the same models. We will use a different
statistical philosophy, namely Bayesian inference, to fit these
models. We will show how the Bayesian approach can be tricked into
giving frequentist answers using data cloning (Lele et al. 2007, Ecology
Letters). We will also discuss the rudiments of frequentist and Bayesian
inference, although we will not go into their pros and cons at this
time. That will be covered during sessions 3 and 4 of the fifth day
(and over beer afterwards).
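
As a rough illustration of the data cloning idea mentioned above, the
following sketch uses a conjugate Beta-Binomial model, where the
posterior is available in closed form and no MCMC is needed. The data
(7 successes in 20 trials), the Beta(2, 8) prior and the function name
are hypothetical choices made for this example; they are not taken from
the course material, and the sketch is in Python for brevity although
the course itself works in R with JAGS and 'dclone'. With K clones of
the data, the posterior mean should approach the maximum likelihood
estimate, and K times the posterior variance should approach the
asymptotic variance of that estimate.

```python
def cloned_posterior(successes, trials, k, prior_a=2.0, prior_b=8.0):
    """Posterior mean and variance of p after cloning the data k times.

    With a Beta(prior_a, prior_b) prior and k independent copies of the
    binomial data, the posterior is Beta(a, b) with the parameters below.
    """
    a = prior_a + k * successes
    b = prior_b + k * (trials - successes)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


successes, trials = 7, 20              # hypothetical data
mle = successes / trials               # maximum likelihood estimate of p
mle_var = mle * (1 - mle) / trials     # asymptotic variance of the MLE

for k in (1, 10, 100, 1000):
    mean, var = cloned_posterior(successes, trials, k)
    print(f"K={k:5d}  posterior mean={mean:.4f}  K*posterior var={k * var:.5f}")

print(f"MLE={mle:.4f}  asymptotic var={mle_var:.5f}")
```

As K grows, the influence of the prior fades, which is the sense in
which the Bayesian machinery ends up giving frequentist answers.
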
1. What makes an inference statistical inference?
2. What do we mean by probability of an event?
3. How do we quantify uncertainty in an inferential statement in the 
frequentist framework?
4. How do we quantify uncertainty in an inferential statement in the 
Bayesian framework?
We will then discuss simulation based methods to quantify
uncertainty.
1. Parametric bootstrap to quantify frequentist uncertainty
2. Markov Chain Monte Carlo to quantify Bayesian uncertainty
3. Fitting LM and GLM using JAGS and the Bayesian approach
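
The parametric bootstrap in item 1 above can be sketched as follows for
a simple linear regression: fit the model, simulate new responses from
the fitted model, refit each simulated data set, and use the spread of
the refitted slopes as a measure of uncertainty. The data, function
names and number of bootstrap replicates are hypothetical and chosen
only for illustration; the sketch is in Python using the standard
library, not the R code used in the course.

```python
import random
import statistics


def fit_line(x, y):
    """Least squares intercept and slope; residual SD uses the n - 2 divisor."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = (rss / (n - 2)) ** 0.5
    return intercept, slope, sigma


def parametric_bootstrap_slope(x, y, n_boot=2000, seed=1):
    """Simulate new responses from the fitted model and refit each time."""
    rng = random.Random(seed)
    intercept, slope, sigma = fit_line(x, y)
    slopes = []
    for _ in range(n_boot):
        y_sim = [intercept + slope * xi + rng.gauss(0.0, sigma) for xi in x]
        slopes.append(fit_line(x, y_sim)[1])
    slopes.sort()
    lower = slopes[int(0.025 * n_boot)]
    upper = slopes[int(0.975 * n_boot) - 1]
    return slope, statistics.stdev(slopes), (lower, upper)


# Hypothetical data: true intercept 1.0, true slope 0.5, noise SD 1.0
rng = random.Random(42)
x = [i / 2 for i in range(30)]
y = [1.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

slope, se, (lo, hi) = parametric_bootstrap_slope(x, y)
print(f"slope={slope:.3f}  bootstrap SE={se:.3f}  95% interval=({lo:.3f}, {hi:.3f})")
```

For this Gaussian model the bootstrap standard error should be close to
the usual analytic one; the appeal of the method is that the same recipe
carries over to models where no simple formula is available.
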

Wednesday 16th
Linear Mixed Models
Historically, linear mixed models arose in the study of quantitative 
genetics and heritability issues. They were successfully applied in 
animal breeding and led to the 'white' revolution with an abundance of
milk supply for the developing world. They were also used in horse
racing and other such fun areas. The other situation where linear mixed
effects models were developed was in the context of growth curves. We
will follow this historical trajectory of mixed models, paying tribute
to the great statisticians R. A. Fisher, C. R. Rao and Jerzy Neyman, and
study linear mixed models first. The questions they tried to answer
included deciding the genetic value of a sire and/or a dam, studying
heritability of traits and studying co-evolution of traits. These can
be answered provided we assume that the sires and dams in our experiment 
or sample are merely a sample from a super-population of sires and dams. 
In growth curve analysis, we need to take into account that each 
individual is unique in its own way but is also a part of a population. 
How do we make both individual level and population level inferences? In
modern times, linear mixed effects models have arisen in the context of
small area estimation in survey sampling, where one is interested in
making inferences about a census tract based on county or state level
data. These models also arise in the context of combining remotely sensed
data from different resolutions and types. The main issues that we will be
discussing are:
1. What is a random effect? What is a fixed effect? How do we decide 
if an effect is random or fixed?
2. How do we modify a linear regression model to accommodate random 
effects?
3. Why bother fitting a mixed effects model? What do we gain?
4. How do we modify the JAGS linear model program to fit a linear mixed
effects model?
5. What is the difference between a Bayesian and a frequentist 
inference?
6. What is a prior? What is a non-informative prior?
7. How do we interpret the results of a linear mixed effects model 
fit? Graphical and simulation based methods
8. How do we do model selection with mixed effects models?
9. How do we do model diagnostics in mixed effects models?
10. Parameter identifiability issues in linear mixed models
As we discuss these applications, we will discuss some subtle 
computational issues involved in using MCMC. In my recollection (which 
may be biased as it has been about 25 years since the quote), Daryl 
Pregibon said: MCMC is the crack cocaine of modern statistics; it is 
addictive, seductive and destructive. Hence, it is important for a 
practitioner to understand these issues in order not to misuse the MCMC 
technique.
1. What is a Markov Chain Monte Carlo method? Why is it necessary for 
mixed models?
2. What are the subtleties in implementing MCMC? Convergence of the
algorithm, mixing of the chains.
3. Pros and cons of using MCMC

Thursday 17th
Generalised Linear Mixed Models
We will again start the discussion of GLMM in its historical context.
One of the initial uses of mixed models was in the context of
overdispersion in count data. Zero inflated count data was another
important example. The example that drove the current revolution in the
use of GLMM was in the context of spatial epidemiology. Clayton and
Kaldor (1987, Biometrics) showed that one can use spatial correlation to
improve the prediction in mapping disease rates. This was also an
example of the application of Empirical Bayes methods that allow one to
pool information from different spatial areas (or studies, or scales,
and so on).
1. Zero inflated data. In many practical situations, we observe that
there are many locations where there are zero counts, far in excess of 
what would be expected under the Poisson regression model. This can be 
effectively modelled using a mixed model framework. The mixed models 
framework allows us to use much more complex and realistic models.
2. Overdispersion in GLM, Spatial GLM, Spatio-temporal GLM. The Poisson
regression model assumes that the mean and variance are equal. This is
often not true in practice. Generally, the variance in the data exceeds
the mean. One can show that such overdispersion can be modelled using a
mixed effects model. These models also arise in the context of
capture-recapture sampling where capture probabilities vary across space
or time or individuals.
3. Longitudinal or panel data with discrete response variable. Many times
we have data on different individuals where within the individual there
is temporal dependence but individuals are independent of each other.
Cluster sampling is another situation where we have dependence within a
cluster but independence between clusters. Analysis of such data needs to
take into account the innate variation between individuals before one can
discuss the effect of interesting covariates or risk factors. Such data
are effectively modelled using GLMMs.
4. Measurement error, missing data. Missing data and measurement error
are ubiquitous in ecological studies. Mixed models provide a convenient
way to take into account these difficulties and make inferences about the
underlying processes of interest. We will discuss these issues in the
context of Population Viability Analysis, Spatial population dynamics
and source-sink analysis, and Occupancy and abundance surveys. These
issues also arise in ordinary linear and generalized linear models if the
covariates are measured with error.
5. Additional topics depending on the interest of the participants. 
These may include, for example, discussion of Species Distribution 
Models, Resource Selection Functions and Animal movement models.
6. Computational issues: Advanced topics
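
The overdispersion point in item 2 above can be illustrated with a small
simulation: adding a Gaussian random effect on the log scale to a
Poisson model pushes the variance of the counts above their mean. The
parameter values, function names and the use of Knuth's method for
Poisson draws are assumptions made for this sketch, which is written in
Python with the standard library and is not the course's JAGS code.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_sites, log_mean, sd_site, seed=1):
    """Counts with a Gaussian site-level random effect on the log scale."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        site_effect = rng.gauss(0.0, sd_site)   # sd_site = 0 gives plain Poisson
        counts.append(poisson_draw(rng, math.exp(log_mean + site_effect)))
    return counts


for sd_site in (0.0, 0.8):
    counts = simulate_counts(n_sites=5000, log_mean=1.0, sd_site=sd_site)
    mean = statistics.fmean(counts)
    var = statistics.variance(counts)
    print(f"sd_site={sd_site}: mean={mean:.2f}  variance={var:.2f}  ratio={var / mean:.2f}")
```

With no random effect the variance-to-mean ratio should be close to 1,
as the Poisson model requires; with the random effect it should be
clearly larger than 1.
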

Friday 18th
Mixed Models in a Bayesian Framework
MCMC is not the only approach to analyse mixed models. We will briefly 
discuss Laplace approximation based techniques (INLA, in particular) 
along with approximate techniques such as Composite likelihood and 
Approximate Bayesian Computation. Because of its mathematical nature,
this discussion will be somewhat limited, only giving the basics and 
hinting at the important issues.
7. Philosophical issues: Sophie’s choice
1. What are the philosophical problems with using the frequentist 
quantification of uncertainty?
2. What are the philosophical problems with using the Bayesian 
quantification of uncertainty?
3. Sophie’s choice?

Other upcoming courses

1.	November 6th – 10th 2017
LANDSCAPE GENETIC DATA ANALYSIS USING R #LNDG
Margam Discovery Centre, Wales, Prof. Rodney Dyer
http://www.prstatistics.com/course/landscape-genetic-data-analysis-using-r-lndg02/

2.	November 20th - 25th 2017
APPLIED BAYESIAN MODELLING FOR ECOLOGISTS AND EPIDEMIOLOGISTS #ABME
SCENE, Scotland, Dr. Matt Denwood
http://www.prstatistics.com/course/applied-bayesian-modelling-ecologists-epidemiologists-abme03/

3.	November 27th – December 1st 2017
INTRODUCTION TO PYTHON FOR BIOLOGISTS #IPYB
Margam Discovery Centre, Wales, Dr. Martin Jones
http://www.prinformatics.com/course/introduction-to-python-for-biologists-ipyb04/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4.	December 4th - 8th 2017
ADVANCING IN STATISTICAL MODELLING USING R #ADVR
Margam Discovery Centre, Wales, Dr. Luc Bussiere, Dr. Tom Houslay, Dr.
Ane Timenes Laugen
http://www.prstatistics.com/course/advancing-statistical-modelling-using-r-advr07/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
5.	January 29th – February 2nd 2018
INTRODUCTION TO BAYESIAN HIERARCHICAL MODELLING #IBHM
SCENE, Scotland, Dr. Andrew Parnell
http://www.prstatistics.com/course/introduction-to-bayesian-hierarchical-modelling-using-r-ibhm02/

6.	January 29th – February 2nd 2018
PHYLOGENETIC DATA ANALYSIS USING R #PHYL
SCENE, Scotland, Dr. Emmanuel Paradis
https://www.prstatistics.com/course/introduction-to-phylogenetic-analysis-with-r-phyg-phyl02/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
7.	February 19th – 23rd 2018
MOVEMENT ECOLOGY #MOVE
Margam Discovery Centre, Wales, Dr Luca Borger, Dr Ronny Wilson, Dr 
Jonathan Potts
https://www.prstatistics.com/course/movement-ecology-move01/

8.	February 19th – 23rd 2018
GEOMETRIC MORPHOMETRICS USING R #GMMR
Margam Discovery Centre, Wales, Prof. Dean Adams, Prof. Michael Collyer, 
Dr. Antigoni Kaliontzopoulou
http://www.prstatistics.com/course/geometric-morphometrics-using-r-gmmr01/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
9.	March 5th - 9th 2018
SPATIAL PRIORITIZATION USING MARXAN #MRXN
Margam Discovery Centre, Wales, Jennifer McGowan
https://www.prstatistics.com/course/introduction-to-marxan-mrxn01/

10.	March 12th - 16th 2018
ECOLOGICAL NICHE MODELLING USING R #ENMR
Glasgow, Scotland, Dr. Neftali Sillero
http://www.prstatistics.com/course/ecological-niche-modelling-using-r-enmr02/

11.	March 19th – 23rd 2018
BEHAVIOURAL DATA ANALYSIS USING MAXIMUM LIKELIHOOD IN R #BDML
Glasgow, Scotland, Dr William Hoppitt
http://www.psstatistics.com/course/behavioural-data-analysis-using-maximum-likelihood-bdml01/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
12.	April 9th – 13th 2018
NETWORK ANALYSIS FOR ECOLOGISTS USING R #NTWA
Glasgow, Scotland, Dr. Marco Scotti
https://www.prstatistics.com/course/network-analysis-ecologists-ntwa02/

13.	April 16th – 20th 2018
INTRODUCTION TO STATISTICAL MODELLING FOR PSYCHOLOGISTS USING R #IPSY
Glasgow, Scotland, Dr. Dale Barr, Dr. Luc Bussiere
http://www.psstatistics.com/course/introduction-to-statistics-using-r-for-psychologists-ipsy01/

14.	April 23rd – 27th 2018
MULTIVARIATE ANALYSIS OF ECOLOGICAL COMMUNITIES USING THE VEGAN PACKAGE 
#VGNR
Glasgow, Scotland, Dr. Peter Solymos, Dr. Guillaume Blanchet
https://www.prstatistics.com/course/multivariate-analysis-of-ecological-communities-in-r-with-the-vegan-package-vgnr01/

15.	April 30th – 4th May 2018
QUANTITATIVE GEOGRAPHIC ECOLOGY: MODELING GENOMES, NICHES, AND 
COMMUNITIES #QGER
Glasgow, Scotland, Dr. Dan Warren, Dr. Matt Fitzpatrick
https://www.prstatistics.com/course/quantitative-geographic-ecology-using-r-modelling-genomes-niches-and-communities-qger01/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
16.	May 7th – 11th 2018
ADVANCES IN MULTIVARIATE ANALYSIS OF SPATIAL ECOLOGICAL DATA USING R #MVSP
CANADA (QUEBEC), Prof. Pierre Legendre, Dr. Guillaume Blanchet
https://www.prstatistics.com/course/advances-in-spatial-analysis-of-multivariate-ecological-data-theory-and-practice-mvsp03/

17.	May 14th - 18th 2018
INTRODUCTION TO MIXED (HIERARCHICAL) MODELS FOR BIOLOGISTS #IMBR
CANADA (QUEBEC), Prof Subhash Lele
https://www.prstatistics.com/course/introduction-to-mixed-hierarchical-models-for-biologists-using-r-imbr01/

18.	May 21st - 25th 2018
INTRODUCTION TO PYTHON FOR BIOLOGISTS #IPYB
SCENE, Scotland, Dr. Martin Jones
http://www.prinformatics.com/course/introduction-to-python-for-biologists-ipyb05/

19.	May 21st - 25th 2018
INTRODUCTION TO REMOTE SENSING AND GIS FOR ECOLOGICAL APPLICATIONS
Glasgow, Scotland, Prof. Duccio Rocchini, Dr. Luca Delucchi
https://www.prinformatics.com/course/introduction-to-remote-sensing-and-gis-for-ecological-applications-irms01/

20.	May 28th – 31st 2018
STABLE ISOTOPE MIXING MODELS USING SIAR, SIBER AND MIXSIAR #SIMM
CANADA (QUEBEC), Dr. Andrew Parnell, Dr. Andrew Jackson
https://www.prstatistics.com/course/stable-isotope-mixing-models-using-r-simm04/

21.	May 28th – June 1st 2018
ADVANCED PYTHON FOR BIOLOGISTS #APYB
SCENE, Scotland, Dr. Martin Jones
https://www.prinformatics.com/course/advanced-python-biologists-apyb02/
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
22.	June 12th - 15th 2018
SPECIES DISTRIBUTION MODELLING #DBMR
Myuna Bay sport and recreation, Australia, TBC
COMING SOON  www.PRstatistics.com

23.	June 12th – 15th 2018
MARK RECAPTURE METHODS IN ECOLOGY #MKRC
Myuna Bay sport and recreation, Australia, TBC
COMING SOON  www.PRstatistics.com

24.	June 18th – 22nd 2018
STRUCTURAL EQUATION MODELLING FOR ECOLOGISTS AND EVOLUTIONARY BIOLOGISTS 
USING R #SEMR
Myuna Bay sport and recreation, Australia, TBC
COMING SOON  www.PRstatistics.com
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
25.	July 2nd - 5th 2018
SOCIAL NETWORK ANALYSIS FOR BEHAVIOURAL SCIENTISTS USING R #SNAR
Glasgow, Scotland, Prof James Curley
http://www.psstatistics.com/course/social-network-analysis-for-behavioral-scientists-snar01/

26.	July 8th – 12th 2018
MODEL BASE MULTIVARIATE ANALYSIS OF ABUNDANCE DATA USING R #MBMV
Glasgow, Scotland, Prof David Warton
https://www.prstatistics.com/course/model-base-multivariate-analysis-of-abundance-data-using-r-mbmv02/

27.	July 16th – 20th 2018
PRECISION MEDICINE BIOINFORMATICS: FROM RAW GENOME AND TRANSCRIPTOME 
DATA TO CLINICAL INTERPRETATION #PMBI
Glasgow, Scotland, Dr Malachi Griffith, Dr. Obi Griffith
COMING SOON www.prinformatics.com

28.	July 23rd – 27th 2018
EUKARYOTIC METABARCODING
Glasgow, Scotland, Dr. Owen Wangensteen
http://www.prinformatics.com/course/eukaryotic-metabarcoding-eukb01/


-- 
Oliver Hooker PhD.
PR statistics

2017 publications -

Ecosystem size predicts eco-morphological variability in post-glacial 
diversification. Ecology and Evolution. In press.

The physiological costs of prey switching reinforce foraging 
specialization. Journal of animal ecology.

prstatistics.com
facebook.com/prstatistics/
twitter.com/PRstatistics
groups.google.com/d/forum/pr-statistics-post-course-forum
prstatistics.com/organiser/oliver-hooker/

6 Hope Park Crescent
Edinburgh
EH8 9NA

+44 (0) 7966500340


