[R-sig-eco] Covariate in MANYGLM

Thu Jan 21 18:03:20 CET 2016

David,

Thanks for your response to Martijn's question. I too am interested in
using manyglm in analyzing multifactor experiments, and I think I've
figured out how to do this. One question though regarding the model and
sequential sums of squares. I think understand correctly that the anova
results for the last term (I.e., factor) in the model are equivalent to
partial sums of squares results. If so, you would just re-run the
analysis, putting a different factor last in the model to get partial sums
of squares results for it. However, when trying to figure which species
contribute most to the multivariate results, it seems to me that their
responses (and the associated stats, e.g., 2*log-likelihood) are
associated with response to all the terms in the model combined, right? If
it is, is it possible to rank species in terms of their responses to the
particular factor of interest, while still accounting for the other
factors (or covariate, as in Martijn's case)?

If I'm not making myself clear, I guess what I'm asking is, using
Martijn's ancova example, how do you rank species in terms of their least
squares or corrected mean responses to the treatment (I.e., corrected for
the covariate)?

Thanks,
Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-202-5877

On 1/20/16 6:19 PM, "David Warton" <david.warton at unsw.edu.au> wrote:

>Hi Martijn,
>Thanks for your question.  Always delighted to help people playing with
>mvabund.
>
>Let me start by complementing you on your excellent taste in software ;)
>
>manyglm has been written to behave very much like the glm function (in
>fact it works by fitting glm's to each species).  So the answers to your
>questions are quite generic and apply to glm usage also.
>
>First, how to account for different plot sizes - the best way to do this
>is actually using an offset, rather than an additional predictor.  For
>presence/absence data, the most natural way to do this is using the
>complementary log-log link function (which can be understood as modelling
>mean abundance from data truncated to presence/absence) and adding an
>offset for log(Area) such that abundance is assumed proportional to plot
>area.  If not sure about this assumption you could additionally put
>log(area) in as a predictor to see if there was any effect beyond
>proportionality.  Note we use log(Area) because the cloglog link, like
>the log link in Poisson regression, models effects on mean abundance on a
>log scale.  (Just on the off chance that are some distance-based users
>reading this e-mail - differences in sampling intensity, whether intended
>or otherwise, seem to be quite common in this type of data.  Yet I am yet
>to see a reliable way to account for such effects without specifying a
>formal statistical model to account for them, distance-based "fixes"
>frequently go awry.)
>
>Anyway, using mvabund version 3.11.6 as currently on github you can fit
>such a model as follows:
>	ft_cloglog = manyglm(birdmva~ Treatment+offset(log(Area)), data=x,
>family=binomial("cloglog"))
>Unfortunately we recently found a bug in anova.manyglm for cloglog models
>- it is overriding the cloglog choice and fitting logistic regressions
>(logit link) to perform the anova.  It doesn't do this is manyglm, just
>in the anova.  Alice (Yi) Wang is working on this as we speak and I can
>let you know when the corrected code is on github.  Please contact me
>directly if you are having trouble loading the latest version from github.
>
>As for your second question, yes results absolutely do change as you
>change the order in which terms are entered into the model.  If you are
>familiar with the concept of Type I Sums of Squares, this is more-or-less
>what the anova function does on R.  That is, it adds terms to the model
>sequentially and tests if the effect of adding each of these terms (in
>this sequence) is significant.  So for example in your model 1, you are
>testing for a (marginal) effect of moss, i.e. an effect of moss without
>anything else in the model, then you test for an effect of soil.dry given
>that you already have moss in the model.  Your model 2 does the reverse,
>testing for a marginal effect of soil.dry, then testing for an effect of
>moss given soil.dry.  So what is more appropriate for your context - well
>presumably you are interested in the effect of Treatment after accounting
>for the effect of sampling intensity (via different areas), so if you
>have a term in your model for log(Area), it should appear before
>Treatment in your manyglm call.
>
>All the best
>David
>
>
> 
>David Warton
>Professor and Australian Research Council Future Fellow
>School of Mathematics and Statistics, Evolution & Ecology Research
>Centre, Centre for Ecosystem Science
>The University of New South Wales NSW 2052 AUSTRALIA
>phone (61)(2) 9385-7031
>fax (61)(2) 9385-7123
> 
>http://www.eco-stats.unsw.edu.au/ecostats15.html
>
> 
>
>------------------------------
>
>Date: Wed, 20 Jan 2016 09:30:59 +0000
>From: Martijn Versluijs <martijn.versluijs at slu.se>
>To: "r-sig-ecology at r-project.org" <r-sig-ecology at r-project.org>
>Subject: [R-sig-eco] Covariate in MANYGLM
>Message-ID: <aa40c173c6f7409587b8f3f5781b7fa3 at exch2-4.slu.se>
>Content-Type: text/plain; charset="UTF-8"
>
>Dear all,
>I try to fit MANYGLM models to my species community data. I am working
>with bird data and want to see if forest restoration lead to changes in
>bird communities. I have 4 treatments and 10 stand per treatment. I work
>with presence/absence data. The 40 plots I use differ in size (between
>3-22 ha), therefore I want to correct for plot-size in the model. My
>first question is if it is possible to include a covariate (plot-size) in
>a MANYGLM model. Until now I treated MANYGLM just like a normal GLM and
>expected that the model correct for plot-size but I am not sure how it
>works.
>Model:
>< manyglm(birdmva~Area+Treatment, data=x, family="binomial")
>
>Secondly, I also want to know which species contribute to the differences
>in community structure. But I am a little confused about the univariate
>test! First (if possible in MANYGLM), what to do with the covariate.
>Secondly, how do you determine the order of the variables in the model,
>if you change order in the global MANYGLM model, univariate outcome
>chance significantly. Both multivariate test and at species level.
>< anova.manyglm(model, test="LR")
>
>Example with spider dataset:
>
>Univariate test model 1
>Analysis of Deviance Table
>
>Model: manyglm(formula = spiddat ~ moss + soil.dry, data=x, family =
>"neg.binomial")
>
>Multivariate test:
>            Res.Df Df.diff    Dev Pr(>Dev)
>(Intercept)     27
>moss            26      1   71.86    0.001 ***
>soil.dry        25      1   104.51    0.001 ***
>
>
>Univariate test model 2
>
>Model: manyglm(formula = spiddat ~ soil.dry + moss, data=x, family =
>"neg.binomial")
>
>
>
>Multivariate test:
>
>            Res.Df Df.diff    Dev Pr(>Dev)
>
>(Intercept)     27
>
>Soil.dry        26       1  147.30    0.001 ***
>
>Moss            25       1   29.07    0.068 .
>
>Moss is significant in the first model but not in the second!
>
>This is the first time I work with MANYGLM, thus any assistance would
>help.
>
>Thanks in advance.
>
>Martijn Versluijs
>__________________________________________________________________________
>________
>PhD-Student
>Department of Wildlife, Fish, and Environmental Studies (Vilt, fisk och
>milj?aculty of Forest Sciences (Skogsvetenskapliga fakulteten) Swedish
>University of Agricultural Sciences (SLU)
>S-901 83 Ume?Sweden
>
>
>	[[alternative HTML version deleted]]
>
>
>
>------------------------------
>
>Subject: Digest Footer
>
>_______________________________________________
>R-sig-ecology mailing list
>R-sig-ecology at r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>------------------------------
>
>End of R-sig-ecology Digest, Vol 94, Issue 8
>
>_______________________________________________
>R-sig-ecology mailing list
>R-sig-ecology at r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology