[R-sig-eco] simple question about CCA

Jari Oksanen jari.oksanen at oulu.fi
Thu Sep 11 12:44:47 CEST 2014


Simone, 

If you continue your first example and *look* at the result object, you'll see:

> ccatest
Call: cca(formula = dat[8:144] ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, data = dat)

              Inertia Proportion Rank
Total           13.76       1.00     
Constrained     13.76       1.00   62
Unconstrained    0.00       0.00    0
Inertia is mean squared contingency coefficient 
Some constraints were aliased because they were collinear (redundant)

This shows you, among other things, that the model has more constraints than the observations can support: there is no residual variation (Unconstrained Inertia = 0), and "Some constraints were aliased because they were collinear (redundant)". Aliased constraints are dropped from the result, so you won't see them, and it is the last ones in the formula that are dropped. In this case, the aliased constraints are:

> alias(ccatest, names=TRUE)
[1] "x6SCL" "x6SL"  "x6ZC"  "x6ZCL" "x7"

That is, four levels of x6, and x7. These are not shown. If you change the order of the constraints, some other variables may end up last, and those will then be the aliased ones that are not shown and cannot be analysed.
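The order dependence can be seen in a much smaller setting. The sketch below is only an analogy using base R's lm() on made-up data (the data frame toy and its columns a, b, c, y are hypothetical and are not the poster's dat): one predictor is an exact linear combination of the other two, so one of the three must be aliased, and which one depends on the order in the formula.

```r
# Toy data (hypothetical, not the poster's dat): c is an exact linear
# combination of a and b, so one of the three terms must be aliased.
set.seed(1)
toy <- data.frame(a = rnorm(20), b = rnorm(20))
toy$c <- toy$a + toy$b
toy$y <- rnorm(20)

# Same terms, different order
fit1 <- lm(y ~ a + b + c, data = toy)
fit2 <- lm(y ~ c + a + b, data = toy)

# Aliased terms are the ones whose coefficient is reported as NA
aliased1 <- names(which(is.na(coef(fit1))))
aliased2 <- names(which(is.na(coef(fit2))))
print(aliased1)
print(aliased2)
```

In both fits the redundant term that comes last in the formula is the one that is dropped, which is the same pattern as x7 disappearing only when it is listed last.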

You need either more data (more observations) or a more sensible model with fewer constraints. The first way (collect more data) is more heroic, but the second is more clever.

If you look at your data (dat), you see that x5 is a factor with 54 levels and x6 is a factor with 10 levels. You have 63 observations, and these two factors together, with 64 levels, are able to completely explain everything and anything in these data: you run out of degrees of freedom.
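The counting can be checked with a rough sketch in base R. The data frame toy below is a made-up stand-in with the same shape as described above (63 rows, x5 with 54 levels, x6 with 10 levels); it assumes that x1 to x4 and x7 are numeric, which the original data may or may not satisfy. It only counts the columns that the formula expands to and compares them with the rank of 62 reported above for 63 observations.

```r
# Hypothetical stand-in for dat: 63 observations, x5 and x6 as factors
# with 54 and 10 levels; x1..x4 and x7 are ASSUMED to be numeric.
set.seed(42)
n <- 63
toy <- data.frame(
  x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n),
  x5 = factor(rep_len(sprintf("s%02d", 1:54), n)),
  x6 = factor(rep_len(LETTERS[1:10], n)),
  x7 = rnorm(n)
)

# Expand the formula into columns: numeric terms give one column each,
# a factor with k levels gives k - 1 contrast columns.
mm <- model.matrix(~ x1 + x2 + x3 + x4 + x5 + x6 + x7, data = toy)

n_constraints <- ncol(mm) - 1      # drop the intercept column
max_rank      <- nrow(toy) - 1     # centring uses up one degree of freedom
n_aliased_min <- n_constraints - max_rank

print(c(constraints = n_constraints, max_rank = max_rank,
        aliased_at_least = n_aliased_min))
```

Under these assumptions the formula expands to 4 + 53 + 9 + 1 = 67 constraint columns, while at most 62 can be estimated, so at least five must be aliased. That agrees with the five names returned by alias(ccatest, names=TRUE).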

Sorry for top-posting and bad formatting: this is MS Outlook.

cheers, Jari Oksanen


________________________________________
From: r-sig-ecology-bounces at r-project.org <r-sig-ecology-bounces at r-project.org> on behalf of Simone Ruzza <simone.ruzza12 at gmail.com>
Sent: 11 September 2014 13:24
To: r-sig-ecology at r-project.org
Subject: [R-sig-eco] simple question about CCA

Dear all,

apologies for the simplicity of my question, maybe it has been asked
many times, but I am a total novice to CCA. I have performed a CCA
using a series of environmental variables that comprise a mixture of
categorical and non-categorical variables. What I do not understand is
why, when I change the order of my variables and plot the results, a
variable disappears from the CCA biplot, i.e. the last one, which is a
continuous variable. I realise that this might be a very simple
question, so I would be happy even with a reference where I can find an
answer. Below is some code showing what is happening.

thanks in advance,

Simone



library(RCurl)   # getURL()
library(vegan)   # cca()
x <- getURL("https://dl.dropboxusercontent.com/u/33966347/testdata.csv")
dat <- read.csv(text = x)


# example 1: x7 disappears from the plot (note that x5 and x6 are categorical)
ccatest <- cca(dat[8:144] ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, data = dat)
plot(ccatest)

# example 2: x7 is present in the plot
ccatest1 <- cca(dat[8:144] ~ x7 + x1 + x2 + x3 + x4 + x5 + x6, data = dat)
plot(ccatest1)

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


