[R-sig-eco] adonis or other multi-variate approaches to examine community

David Warton david.warton at unsw.edu.au
Fri Apr 10 00:34:10 CEST 2015


Hi Chris,
I'm sorry to say that you have opened up a big can of worms.  The short answer is that adonis tests for interaction do not have any sensible meaning in terms of the originally measured variables once you get beyond the Euclidean case (or something that can be interpreted as Euclidean after some hopefully sensible-sounding transformation).  This is before you start thinking about whether to stratify when resampling.

There were actually two mistakes in your example simulation:
- it doesn't make sense to just add 50 or add 100 to abundances while keeping all else constant.  Species with higher means also have higher variances, and different shapes to their distributions.  A more realistic scenario would be to simulate from a distribution which seems to work pretty well in practice, like the negative binomial, varying the mean across groups.  (I would add that a more sensible definition of main effect for abundance data would be to multiply the mean through by a constant, not add a constant.)
- in the data for your final adonis call, you hadn't actually introduced any interaction, just main effects for treatment and time.  So you should have been looking for significant main effects for treatment and time and no significant interaction.  What you actually got was everything significant (including the interaction).  I have found similar results myself: when simulating from a range of distributions under main effects models, adonis often incorrectly declares a significant interaction.  The reason is that once you get beyond the Euclidean case, what adonis has in mind as a main effects model is different to anything we mere mortals would think of as a main effects model, so when we generate data under what we understand as main effects, adonis thinks otherwise.  This is not an issue with the adonis function itself; it is a broader issue with the whole distance-based framework.
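
To make the first point concrete, here is a minimal sketch of a main-effects-only simulation from the negative binomial, with treatment and time acting multiplicatively on the means.  The multipliers, the dispersion value and the baseline means are hypothetical values picked purely for illustration, and the final call assumes a vegan version that provides adonis2 with a by argument:

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n_sp <- 20
treatment <- gl(3, 6)                  # same design as in the original post
time <- as.ordered(rep(1:3, 6))

# Hypothetical parameter values, chosen for illustration only
base_mean <- exp(rnorm(n_sp, mean = 3, sd = 1))  # species-specific baseline means
trt_mult  <- c(1, 1, 3)                # treatment 3 triples every mean
time_mult <- c(1, 1.5, 2)              # means increase with time
size      <- 2                         # NB dispersion: var = mu + mu^2 / size

# Main effects only, on the multiplicative scale: no interaction term
row_mult <- trt_mult[as.integer(treatment)] * time_mult[as.integer(time)]
mu <- outer(row_mult, base_mean)       # 18 x 20 matrix of means

speciesmatrix <- matrix(rnbinom(length(mu), mu = mu, size = size),
                        nrow = nrow(mu),
                        dimnames = list(1:18, paste0("Sp", 1:n_sp)))

# Only run the test if vegan is installed
if (requireNamespace("vegan", quietly = TRUE)) {
  print(vegan::adonis2(speciesmatrix ~ treatment * time,
                       permutations = 999, method = "bray", by = "terms"))
}
```

Since no interaction was simulated, a test that respected this definition of main effects should reject for treatment:time only at about the nominal rate over repeated simulations.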

So if you want to test for interactions in a way that has some meaningful interpretation on the scale of the originally measured variables, the options as I see them are to use a statistical model, or maybe to try some quick-and-dirty transformation to Euclidean (which has its own host of problems).
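
As a rough illustration of why a log-type transformation is the usual candidate here, the sketch below takes hypothetical multiplicative main effects for a single species and computes an interaction contrast on the raw and on the log scale.  The numbers and the contrast helper are made up for illustration; log(y+1) is only one assumed choice of transformation:

```r
# Hypothetical multiplicative main effects for one species
trt_mult  <- c(1, 1, 3)
time_mult <- c(1, 1.5, 2)
mu <- 20 * outer(trt_mult, time_mult)   # 3 x 3 table of cell means (treatment x time)

# Interaction contrast: change from time 1 to time 3 in treatment 3,
# minus the same change in treatment 1
contrast <- function(m) (m[3, 3] - m[3, 1]) - (m[1, 3] - m[1, 1])

raw_contrast <- contrast(mu)        # non-zero: looks like an interaction
log_contrast <- contrast(log(mu))   # zero up to rounding: purely additive

# For count data Y (rows = samples), one assumed quick-and-dirty route is
#   d <- dist(log1p(Y))             # Euclidean distances on log(y + 1)
```

The additivity holds exactly only for the means; log(y+1) applied to small counts is not exactly additive, which is one of the problems with this route.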

(You also asked why treatment went non-significant when you stratified on site - the reason for that is that you were no longer resampling in a way that changes treatment labels across sites, hence no test for treatment was available.)
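
A small base-R sketch of that last point, using the site and treatment vectors from the original post.  The helper perm_within_site is a hypothetical stand-in for what restricting permutations to site strata amounts to, not the code adonis runs internally:

```r
site <- gl(6, 3)        # 6 sites, 3 time points each (as in the original post)
treatment <- gl(3, 6)   # each site receives exactly one treatment

set.seed(1)  # arbitrary seed

# Shuffle row indices within each stratum only
perm_within_site <- function(strata) {
  idx <- seq_along(strata)
  for (s in levels(strata)) {
    rows <- which(strata == s)
    idx[rows] <- rows[sample.int(length(rows))]
  }
  idx
}

perm <- perm_within_site(site)

# Treatment is constant within a site, so its labels never move
unchanged <- all(treatment[perm] == treatment)
```

Here unchanged is TRUE for every such permutation, so the permutation distribution of the treatment statistic is a single point and the test cannot reject.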

All the best
David

 
David Warton
Professor and Australian Research Council Future Fellow
School of Mathematics and Statistics and the Evolution & Ecology Research Centre
The University of New South Wales NSW 2052 AUSTRALIA
phone (61)(2) 9385-7031
fax (61)(2) 9385-7123
 
http://www.eco-stats.unsw.edu.au/ecostats15.html



------------------------------

Message: 3
Date: Wed, 8 Apr 2015 16:33:43 -0500
From: Chris Holmes <holmess1 at illinois.edu>
To: <r-sig-ecology at r-project.org>
Subject: [R-sig-eco] adonis or other multi-variate approaches to
	examine community change over time given main effect of treatment
Message-ID: <55259EB7.70502 at illinois.edu>
Content-Type: text/plain; charset="UTF-8"

Hello All,
I have a large data set with which I want to answer the question of whether initial conditions (i.e. our initial treatments) influenced community composition over time. We have 10 replicates for each treatment and abundances for 15+ species. I have seen approaches that use adonis and capscale to ask whether sites change over time, but they omit a treatment factor such as the one I have in my design. A summary of that code can be found in this paper:
http://www.esajournals.org/doi/full/10.1890/10-2138.1

However, can what I want to do be done properly using adonis and/or capscale? In short, I am interested in the treatment x time interaction.

For example, consider a case in which I expect to see no effect of time or treatment:
#DATA
library(vegan)
speciesmatrix <- matrix(rnorm(3 * 6 * 20, 50, 10), nrow = 3 * 6, ncol = 20,
          dimnames = list(1:18, paste("Sp", 1:20, sep = "")))
#TIME AND TREATMENT VECTORS
time <- as.ordered(rep(1:3, 6))
site <- gl(6, 3)
treatment <- gl(3,6)
#ADONIS
adonis(speciesmatrix~treatment*time,permutations=1000, method="bray")

Call:
adonis(formula = speciesmatrix ~ treatment * time, permutations = 1000, method = "bray")

Contrast this with what happens when I add an effect of treatment by adding 100 to every species in treatment 3:
#Add 100 to each species in treatment 3
speciesmatrix[treatment==3,] <-  speciesmatrix[treatment==3,] + 100

#Adonis
adonis(speciesmatrix~treatment*time,permutations=1000, method="bray")

After this, I see an effect of treatment. Great!
Now let's add 50 to each species at time 2, and 75 to each species at time 3 (keeping the 100 already added to treatment 3 in the previous analysis), to see if this will produce a treatment x time effect:
speciesmatrix[time==2,] <- speciesmatrix[time==2,] + 50
speciesmatrix[time==3,] <- speciesmatrix[time==3,] + 75

adonis(speciesmatrix~treatment*time,permutations=1000, method="bray")

According to this, I see the effect that I expected. However, is this the proper way to analyze these data? I have seen others treat each site as a stratum, but when I incorporate strata into my models, the results I expect to see given my toy data fall apart. For example, I no longer see an effect of treatment after adding 100 to treatment 3.

Any help would be very much appreciated because I have been struggling with this problem for a long time!
