[R-sig-eco] How to get the total variance explained from an envfit object?

Fri Nov 8 15:31:28 CET 2019

Dear Claire Elizabeth Couch,

First let me explain how envfit() works: For continuous environmental variables it actually uses ordination to predict the environmental variable. Under the hood (bonnet), it fits a first degree trend surface (plane in 2D) for environmental variable over the ordination scores, and the R2 is the proportion that surface explains of the variable. The arrow shown is the gradient of this fitted trend surface and shows the direction to which the variable changes most rapidly in a first degree linear model.

Clearly you cannot add these R2 values, because your environmental variables can be (and normally are) inter-correlated.

It seems that you want to work into another direction than envfit: Predict ordination scores by a set of environmental variables.

There are many ways of doing this in R, although we do not provide canned tool for this. You can actually do this even with multiple linear model with function lm() of R::stats. Here is how to do this with vegan::rda() function:

library(vegan)
data(varespec, varechem)
ord <- metaMDS(varespec, trace = FALSE)
fit <- rda(scores(ord) ~ ., data = varechem)

The basic output of this will show you R2:

Call: rda(formula = scores(ord) ~ N + P + K + Ca + Mg + S + Al + Fe +
Mn + Zn + Mo + Baresoil + Humdepth + pH, data = varechem)

              Inertia Proportion Rank
Total         0.13905    1.00000
Constrained   0.11357    0.81681    2
Unconstrained 0.02547    0.18319    2
Inertia is variance

Here the “Proportion” for the Constrained component is the overall R2 = 0.81681. If you want to see the adjusted R2, this is found with

 RsquareAdj(fit)
$r.squared
[1] 0.8168084

$adj.r.squared
[1] 0.5318436

However, you only get the overall R2, but not partial R2 values for single variables. You can use anova(fit, by = “margin”) to find the (lacking) marginal significances of unique effects of the variables, though.

The regression coefficients can be found with command coef() — and probably you want them normalized:

> coef(fit, norm=TRUE)
                 RDA1        RDA2
N        -0.154511701  0.48579513
P         0.002463991 -0.13179802
...
pH        0.701009027 -0.66724274

You can also simplify this model in the usual way, for instance with the function ordistep() that uses permutation test to drop variables one by one (or you can build up this model adding variables one by one if you start with an empty model with ordistep() or ordiR2step()):

ordistep(fit)            # drop variables
m0 <- update(fit, . ~ 1) # m0 is an empty model
ordistep(m0, scope=fit)  # add variables to an empty model

(After these anova(…, by = “margin”) results also give significant effects.)

All this sounds a bit weird to me (or more than “a bit”), but it can be done. I guess there are some readers who get hiccups for using RDA on NMDS, but this can be done as the NMDS space is metric (the *transfer* function is non-metric from community dissimilarities to metric ordination space). It also sounds really odd to have an ordination scores as dependent data in RDA, but this exactly answers the problem you presented: predict ordination scores by a set of external variables.  After all, RDA is nothing but a linear regression for multivariate response data, and there is no need to think it as an ordination.

Cheers, Jari Oksanen

On 7 Nov 2019, at 22:02, Couch, Claire Elizabeth <couchc using oregonstate.edu<mailto:couchc using oregonstate.edu>> wrote:

I am analyzing some microbiome data by using unconstrained ordination (PCA
or NMDS) followed by environmental vector fitting with the envfit function
in the vegan package. The output of envfit includes an r2 value for each
vector or factor included in the envfit model, but I am interested in the
total amount of variation explained by all the vectors/factors, rather than
just stand-alone variables. I presume I cannot simply add up the R2 values
assigned to each environmental variable, because there may be overlap in
the microbiome variation that is "explained" by each environmental
variable. However, there does not seem to be any way of accessing the total
r2 value for the model.

Using an example dataset, this is what I have tried so far:

library(vegan)
library(MASS)

data(varespec, varechem)
library(MASS)
ord <- metaMDS(varespec)
fit <- envfit(ord, varechem, perm = 999)
fit

This shows r2 for each environmental variable, but how do I extract the r2
value for the entire model?

I have tried running fit$r, attributes(fit)$r, and Rsquare.Adj(fit), but
these all return NULL.

I would greatly appreciate any suggestions!

[[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology using r-project.org<mailto:R-sig-ecology using r-project.org>
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

	[[alternative HTML version deleted]]