[R] question about capscale (vegan)

Fri Nov 17 12:18:27 CET 2006

Hello,

Thank you for your help. I have tried to perform the analysis I wanted
using example data rather than my real data, which I cannot provide
here. This is what I tried:

> matrix
     [,1] [,2] [,3]
[1,] 0.00 0.13 0.59
[2,] 0.13 0.00 0.55
[3,] 0.59 0.55 0.00

> dist_mat
     1    2
2 0.13
3 0.59 0.55

# Here the distance matrix is calculated as the percentage of differing
nucleotides between two sequences; R is not used to compute it. The
original data would look like this:

	n1	n2	n3	n4	n5	n6	n7	n8	n9	n10	n11	n12
m1	A	C	G	T	A	G	C	T	A	C	T	A
m2	G	C	T	A	T	G	C	T	A	C	T	A
m3	G	A	G	T	A	G	C	T	A	C	T	A
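
The percentage-difference calculation described above can also be done in R, which would make the example reproducible. The following is a minimal base-R sketch using the three example sequences from the table; the names `seqs`, `p_dist` and `example_dist` are illustrative and not part of the original post. The example sequences give different values from the 0.13/0.59/0.55 shown in `dist_mat`, so the two should be read as separate illustrations.

```r
# Example alignment from the table above (rows = sequences, columns = sites)
seqs <- rbind(
  m1 = c("A", "C", "G", "T", "A", "G", "C", "T", "A", "C", "T", "A"),
  m2 = c("G", "C", "T", "A", "T", "G", "C", "T", "A", "C", "T", "A"),
  m3 = c("G", "A", "G", "T", "A", "G", "C", "T", "A", "C", "T", "A")
)
colnames(seqs) <- paste0("n", 1:12)

# Proportion of sites at which each pair of sequences differs
# (hypothetical helper, written for this illustration only)
p_dist <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- mean(x[i, ] != x[j, ])
    }
  }
  as.dist(d)  # keep only the lower triangle, as a "dist" object
}

example_dist <- p_dist(seqs)
example_dist
```

An object of class "dist" built this way can go directly on the left-hand side of the capscale formula.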

> factors_frame
  time region    city
1 2006 europe  london
2 2005 africa nairobi
3 2005 europe   paris
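
For anyone wanting to rerun this, the two objects printed above could be rebuilt roughly as follows. This is a sketch under assumptions: the post does not show how `dist_mat` was created (`as.dist()` on the square matrix is one standard way), nor whether `time` is numeric or a factor (a factor is assumed here). `sq_mat` is an illustrative name used instead of `matrix` to avoid masking the base function.

```r
# Square symmetric matrix of pairwise distances, as printed above
sq_mat <- matrix(c(0.00, 0.13, 0.59,
                   0.13, 0.00, 0.55,
                   0.59, 0.55, 0.00),
                 nrow = 3, byrow = TRUE)

# Convert to a "dist" object (lower triangle only)
dist_mat <- as.dist(sq_mat)

# Explanatory factors, one row per sequence
# ASSUMPTION: time is treated as a factor; the post does not say
factors_frame <- data.frame(
  time   = factor(c(2006, 2005, 2005)),
  region = factor(c("europe", "africa", "europe")),
  city   = factor(c("london", "nairobi", "paris"))
)

dist_mat
factors_frame
```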

> my.cap <- capscale(dist_mat ~ time + region + time:region +
region:city + time:region:city, factors_frame)

> my.cap

Call:
capscale(formula = dist_mat ~ time + region + time:region + region:city
+      time:region:city, data = factors_frame)

            Inertia Rank
Total         0.445
Constrained   0.445    2
Inertia is squared  distance
Some constraints were aliased because they were collinear (redundant)

Eigenvalues for constrained axes:
   CAP1    CAP2
0.42978 0.01522

> anova(my.cap)
Error in `names<-.default`(`*tmp*`, value = "Residual") :
        attempt to set an attribute on NULL

I am still unsure about the 'comm' argument: I don't understand how
important it is for my type of data, or what it would refer to in my
data. Also, what I am really interested in is a factorial ANOVA with
another factor nested (the model given above), and as you can see R
gives an error that I don't understand either.
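
A plausible explanation for the error, judging only from the output above (constrained inertia equal to total inertia, and no unconstrained row), is that with three observations the model uses up all the available degrees of freedom, so nothing is left as a residual for anova() to test against. The base-R sketch below checks this by computing the rank of the design matrix for the same formula; treating `time` as a factor is an assumption, and `X`, `rank_X` and `resid_df` are illustrative names.

```r
# Factors as printed in the post
# ASSUMPTION: time is treated as a factor
factors_frame <- data.frame(
  time   = factor(c(2006, 2005, 2005)),
  region = factor(c("europe", "africa", "europe")),
  city   = factor(c("london", "nairobi", "paris"))
)

# Design matrix for the right-hand side of the capscale formula
X <- model.matrix(~ time + region + time:region +
                    region:city + time:region:city,
                  data = factors_frame)

# Rank of the design matrix, intercept included
rank_X <- qr(X)$rank

# Residual degrees of freedom: observations minus model rank
resid_df <- nrow(X) - rank_X

rank_X
resid_df
```

If the residual degrees of freedom are zero, the model is saturated. The remedy would then be more observations than model terms, or a simpler model, not a change to the 'comm' argument.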

Thank you for your help in advance. 
Regards,
Alicia

> On Thu, 2006-11-16 at 17:25 +0100, Alicia Amadoz wrote:
> > Hello,
> > 
> > I am interested in using the capscale function of the vegan package
> > for R. I already have a dissimilarity matrix and I intend to use it as
> > the 'distance' argument. But I don't know what kind of data must go in
> > the 'comm' argument. I don't understand what type of data is meant by
> > 'species scores' and 'community data frame', since my data are
> > nucleic distances between different sequences.
> 
> No, that is all wrong. Read ?capscale more closely! It says that you
> need to use the formula to describe the model. "distance" is used to
> tell capscale which distance coefficient to use if the LHS of the model
> formula is a community matrix.
> 
> Argument "comm" is used to tell capscale where to find the species
> matrix that will be used to determine species scores in the analysis,
> *if* the LHS of the formula is a distance matrix. "comm" isn't used if
> the LHS is a data frame, and "distance" is ignored if the LHS is a
> distance matrix.
> 
> As you don't provide a reproducible example of your problem, I will use
> the inbuilt example from ?capscale
> 
> ## load some data
> data(varespec)
> data(varechem)
> 
> Now if you want to fit a capscale model using the raw species data, then
> you would describe the model as so:
> 
> vare.cap <- capscale(varespec ~ N + P + K + Condition(Al), 
>                      data = varechem,
>                      distance = "bray")
> vare.cap
> 
> In the above, LHS of formula is a data frame so capscale looks to
> argument "distance" for the name of the coefficient to turn it into a
> distance matrix. The terms on the RHS of the formula are variables
> looked up in the object assigned to the "data" argument.
> 
> Now let's alter this to start with a dissimilarity/distance matrix
> instead. The exact complement of the above would be:
> 
> dist.mat <- vegdist(varespec, method = "bray")
> vare.cap2 <- capscale(dist.mat ~ N + P + K + Condition(Al),
>                       data = varechem,
>                       comm = varespec)
> vare.cap2
> 
> To explain the above example: first create the Bray-Curtis distance
> matrix (dist.mat). Then use this on the LHS of the formula. When
> capscale now wants to calculate the species scores of the analysis it
> will look to argument "comm" for the data to use in the calculation,
> which in this case we specify as the original species matrix varespec.
> 
> As for what species scores are, well, this is a throwback to the origins
> of the package and the methods included - all of this is related to
> ecology and mainly vegetation analysis (hence vegan).
> 
> For species scores, read variable scores. The distance matrix (however
> calculated) describes how similar your individual sites (read samples)
> are to one another. You can also display information about the variables
> used to determine those distances/similarities, and this is what is
> meant by species scores. Whatever you used to generate the distance
> matrix, the columns represent the info used to generate the "species
> scores".
> 
> If some of this still isn't clear, email the list with the commands used
> to generate your distance matrix in R and I'll have a go at explaining
> this with reference to your data/example.
> 
> > 
> > I would be very grateful if you could help me with this fact in any
> > manner. Thank you in advance for your help.
> > 
> > Regards,
> > Alicia
> 
> HTH
> 
> G
> 
> -- 
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
>  Gavin Simpson                 [t] +44 (0)20 7679 0522
>  ECRC & ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
>  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
>  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
>  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> 
> 
>