[R-sig-eco] Covariate in MANYGLM

Thu Jan 21 01:19:05 CET 2016

Hi Martijn,
Thanks for your question.  Always delighted to help people playing with mvabund.

Let me start by complementing you on your excellent taste in software ;)

manyglm has been written to behave very much like the glm function (in fact it works by fitting glm's to each species).  So the answers to your questions are quite generic and apply to glm usage also.

First, how to account for different plot sizes - the best way to do this is actually using an offset, rather than an additional predictor.  For presence/absence data, the most natural way to do this is using the complementary log-log link function (which can be understood as modelling mean abundance from data truncated to presence/absence) and adding an offset for log(Area) such that abundance is assumed proportional to plot area.  If not sure about this assumption you could additionally put log(area) in as a predictor to see if there was any effect beyond proportionality.  Note we use log(Area) because the cloglog link, like the log link in Poisson regression, models effects on mean abundance on a log scale.  (Just on the off chance that are some distance-based users reading this e-mail - differences in sampling intensity, whether intended or otherwise, seem to be quite common in this type of data.  Yet I am yet to see a reliable way to account for such effects without specifying a formal statistical model to account for them, distance-based "fixes" frequently go awry.)

Anyway, using mvabund version 3.11.6 as currently on github you can fit such a model as follows:
	ft_cloglog = manyglm(birdmva~ Treatment+offset(log(Area)), data=x, family=binomial("cloglog"))
Unfortunately we recently found a bug in anova.manyglm for cloglog models - it is overriding the cloglog choice and fitting logistic regressions (logit link) to perform the anova.  It doesn't do this is manyglm, just in the anova.  Alice (Yi) Wang is working on this as we speak and I can let you know when the corrected code is on github.  Please contact me directly if you are having trouble loading the latest version from github.

As for your second question, yes results absolutely do change as you change the order in which terms are entered into the model.  If you are familiar with the concept of Type I Sums of Squares, this is more-or-less what the anova function does on R.  That is, it adds terms to the model sequentially and tests if the effect of adding each of these terms (in this sequence) is significant.  So for example in your model 1, you are testing for a (marginal) effect of moss, i.e. an effect of moss without anything else in the model, then you test for an effect of soil.dry given that you already have moss in the model.  Your model 2 does the reverse, testing for a marginal effect of soil.dry, then testing for an effect of moss given soil.dry.  So what is more appropriate for your context - well presumably you are interested in the effect of Treatment after accounting for the effect of sampling intensity (via different areas), so if you have a term in your model for log(Area), it should appear before Treatment in your manyglm call.

All the best
David

David Warton
Professor and Australian Research Council Future Fellow
School of Mathematics and Statistics, Evolution & Ecology Research Centre, Centre for Ecosystem Science
The University of New South Wales NSW 2052 AUSTRALIA
phone (61)(2) 9385-7031
fax (61)(2) 9385-7123

http://www.eco-stats.unsw.edu.au/ecostats15.html

------------------------------

Date: Wed, 20 Jan 2016 09:30:59 +0000
From: Martijn Versluijs <martijn.versluijs at slu.se>
To: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
Subject: [R-sig-eco] Covariate in MANYGLM
Message-ID: <aa40c173c6f7409587b8f3f5781b7fa3 at exch2-4.slu.se>
Content-Type: text/plain; charset="UTF-8"

Dear all,
I try to fit MANYGLM models to my species community data. I am working with bird data and want to see if forest restoration lead to changes in bird communities. I have 4 treatments and 10 stand per treatment. I work with presence/absence data. The 40 plots I use differ in size (between 3-22 ha), therefore I want to correct for plot-size in the model. My first question is if it is possible to include a covariate (plot-size) in a MANYGLM model. Until now I treated MANYGLM just like a normal GLM and expected that the model correct for plot-size but I am not sure how it works.
Model:
< manyglm(birdmva~Area+Treatment, data=x, family="binomial")

Secondly, I also want to know which species contribute to the differences in community structure. But I am a little confused about the univariate test! First (if possible in MANYGLM), what to do with the covariate. Secondly, how do you determine the order of the variables in the model, if you change order in the global MANYGLM model, univariate outcome chance significantly. Both multivariate test and at species level.
< anova.manyglm(model, test="LR")

Example with spider dataset:

Univariate test model 1
Analysis of Deviance Table

Model: manyglm(formula = spiddat ~ moss + soil.dry, data=x, family = "neg.binomial")

Multivariate test:
            Res.Df Df.diff    Dev Pr(>Dev)
(Intercept)     27
moss            26      1   71.86    0.001 ***
soil.dry        25      1   104.51    0.001 ***

Univariate test model 2

Model: manyglm(formula = spiddat ~ soil.dry + moss, data=x, family = "neg.binomial")

Multivariate test:

            Res.Df Df.diff    Dev Pr(>Dev)

(Intercept)     27

Soil.dry        26       1  147.30    0.001 ***

Moss            25       1   29.07    0.068 .

Moss is significant in the first model but not in the second!

This is the first time I work with MANYGLM, thus any assistance would help.

Thanks in advance.

Martijn Versluijs
__________________________________________________________________________________
PhD-Student
Department of Wildlife, Fish, and Environmental Studies (Vilt, fisk och milj?aculty of Forest Sciences (Skogsvetenskapliga fakulteten) Swedish University of Agricultural Sciences (SLU)
S-901 83 Ume?Sweden

	[[alternative HTML version deleted]]

------------------------------

Subject: Digest Footer

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

------------------------------

End of R-sig-ecology Digest, Vol 94, Issue 8