[R-sig-eco] How to get the total variance explained from an envfit object?
Couch, Claire Elizabeth
couchc @end|ng |rom oregon@t@te@edu
Fri Nov 8 16:36:44 CET 2019
Dr. Oksanen, thank you so much for taking the time to offer such a clear
and detailed response. Very helpful!
On Fri, Nov 8, 2019 at 06:31 Jari Oksanen <jari.oksanen using oulu.fi> wrote:
> Dear Claire Elizabeth Couch,
>
> First let me explain how envfit() works: For continuous environmental
> variables it actually uses ordination to predict the environmental
> variable. Under the hood (bonnet), it fits a first degree trend surface
> (plane in 2D) for environmental variable over the ordination scores, and
> the R2 is the proportion that surface explains of the variable. The arrow
> shown is the gradient of this fitted trend surface and shows the direction
> to which the variable changes most rapidly in a first degree linear model.
>
> Clearly you cannot add these R2 values, because your environmental
> variables can be (and normally are) inter-correlated.
>
> It seems that you want to work into another direction than envfit: Predict
> ordination scores by a set of environmental variables.
>
> There are many ways of doing this in R, although we do not provide canned
> tool for this. You can actually do this even with multiple linear model
> with function lm() of R::stats. Here is how to do this with vegan::rda()
> function:
>
> library(vegan)
> data(varespec, varechem)
> ord <- metaMDS(varespec, trace = FALSE)
> fit <- rda(scores(ord) ~ ., data = varechem)
>
> The basic output of this will show you R2:
>
> Call: rda(formula = scores(ord) ~ N + P + K + Ca + Mg + S + Al + Fe +
> Mn + Zn + Mo + Baresoil + Humdepth + pH, data = varechem)
>
> Inertia Proportion Rank
> Total 0.13905 1.00000
> Constrained 0.11357 0.81681 2
> Unconstrained 0.02547 0.18319 2
> Inertia is variance
>
> Here the “Proportion” for the Constrained component is the overall R2 =
> 0.81681. If you want to see the adjusted R2, this is found with
>
> RsquareAdj(fit)
> $r.squared
> [1] 0.8168084
>
> $adj.r.squared
> [1] 0.5318436
>
> However, you only get the overall R2, but not partial R2 values for single
> variables. You can use anova(fit, by = “margin”) to find the (lacking)
> marginal significances of unique effects of the variables, though.
>
> The regression coefficients can be found with command coef() — and
> probably you want them normalized:
>
> > coef(fit, norm=TRUE)
> RDA1 RDA2
> N -0.154511701 0.48579513
> P 0.002463991 -0.13179802
> ...
> pH 0.701009027 -0.66724274
>
> You can also simplify this model in the usual way, for instance with the
> function ordistep() that uses permutation test to drop variables one by one
> (or you can build up this model adding variables one by one if you start
> with an empty model with ordistep() or ordiR2step()):
>
> ordistep(fit) # drop variables
> m0 <- update(fit, . ~ 1) # m0 is an empty model
> ordistep(m0, scope=fit) # add variables to an empty model
>
> (After these anova(…, by = “margin”) results also give significant
> effects.)
>
> All this sounds a bit weird to me (or more than “a bit”), but it can be
> done. I guess there are some readers who get hiccups for using RDA on NMDS,
> but this can be done as the NMDS space is metric (the *transfer* function
> is non-metric from community dissimilarities to metric ordination space).
> It also sounds really odd to have an ordination scores as dependent data in
> RDA, but this exactly answers the problem you presented: predict ordination
> scores by a set of external variables. After all, RDA is nothing but a
> linear regression for multivariate response data, and there is no need to
> think it as an ordination.
>
> Cheers, Jari Oksanen
>
> On 7 Nov 2019, at 22:02, Couch, Claire Elizabeth <couchc using oregonstate.edu>
> wrote:
>
> I am analyzing some microbiome data by using unconstrained ordination (PCA
> or NMDS) followed by environmental vector fitting with the envfit function
> in the vegan package. The output of envfit includes an r2 value for each
> vector or factor included in the envfit model, but I am interested in the
> total amount of variation explained by all the vectors/factors, rather than
> just stand-alone variables. I presume I cannot simply add up the R2 values
> assigned to each environmental variable, because there may be overlap in
> the microbiome variation that is "explained" by each environmental
> variable. However, there does not seem to be any way of accessing the total
> r2 value for the model.
>
> Using an example dataset, this is what I have tried so far:
>
> library(vegan)
> library(MASS)
>
> data(varespec, varechem)
> library(MASS)
> ord <- metaMDS(varespec)
> fit <- envfit(ord, varechem, perm = 999)
> fit
>
> This shows r2 for each environmental variable, but how do I extract the r2
> value for the entire model?
>
> I have tried running fit$r, attributes(fit)$r, and Rsquare.Adj(fit), but
> these all return NULL.
>
> I would greatly appreciate any suggestions!
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>
> --
Claire E. Couch (she/her/hers)
NSF Graduate Fellow
Department of Integrative Biology, Oregon State University
--
*Oregon State University in Corvallis, OR is located within the traditional
homelands of the Mary's River or Ampinefu Band of Kalapuya. Following the
Willamette Valley Treaty of 1855 (Kalapuya etc. Treaty), Kalapuya people
were forcibly removed to reservations in Western Oregon. Today, living
descendants of these people are a part of the Confederated Tribes of Grand
Ronde Community of Oregon (https://www.grandronde.org
<https://www.grandronde.org/>) and the Confederated Tribes of the Siletz
Indians (https://ctsi.nsn.us <https://ctsi.nsn.us/>).*
[[alternative HTML version deleted]]
More information about the R-sig-ecology
mailing list