[R-sig-eco] metaMDS: how does stress affect ordination distance?
Jari Oksanen
jari.oksanen at oulu.fi
Tue Jan 6 09:51:09 CET 2009
Quoting Kim Milferstedt <milferst at uiuc.edu>:
> Hello,
>
> I am using metaMDS in vegan 1.13-1 on R 2.6.1 for ordinating microbial
> sequence data.
>
> I've got three general questions about nmds using metaMDS:
>
> 1) Is it fair to assess the range of the x and y axes in nmds for
> comparable data with similar ranges of observed distance?
>
The scaling of NMDS is undefined, and in general you cannot compare
axis scaling across ordinations. Indeed, you can multiply all scores
by any nonzero constant and it changes neither the configuration nor
the solution. However, metaMDS implements Minchin's (DECODA)
half-change scaling that fixes the scale: one unit means halving of
similarity from the replicate similarity. Making some assumptions, you
can compare the scale. The replicate similarity is one key concept
here: you must assume that the solutions are comparable in that sense
as well. Replicate similarity corresponds to the estimated
dissimilarity among points at zero distance in ordination, i.e. an
estimate of the dissimilarity of replicate samples from the same
community, found from the dissimilarity--distance plot of NMDS.
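The half-change idea can be sketched in a few lines. This is only an
illustration in Python, not vegan's code: the function names, the
scores and the dissimilarities are all invented, and I assume an
ordinary least-squares line fitted over all pairs (a real
implementation may use only a subset of the dissimilarities). The
intercept of the line estimates the replicate dissimilarity, and the
unit is the ordination distance at which similarity has halved from
that level.

```python
from itertools import combinations
from math import dist


def half_change_unit(points, dissim):
    """Ordination distance over which similarity halves from the replicate level.

    points: list of (x, y) scores; dissim: {(i, j): dissimilarity in [0, 1]}, i < j.
    """
    pairs = list(combinations(range(len(points)), 2))
    d = [dist(points[i], points[j]) for i, j in pairs]
    y = [dissim[p] for p in pairs]
    n = len(pairs)
    mean_d, mean_y = sum(d) / n, sum(y) / n
    # Ordinary least-squares line: dissimilarity = intercept + slope * distance
    sxx = sum((di - mean_d) ** 2 for di in d)
    sxy = sum((di - mean_d) * (yi - mean_y) for di, yi in zip(d, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    # Replicate similarity is 1 - intercept; it is halved where the fitted
    # dissimilarity reaches (1 + intercept) / 2, i.e. at this distance:
    return (1 - intercept) / (2 * slope)


def rescale(points, unit):
    """Divide all scores by the half-change unit."""
    return [(x / unit, y / unit) for x, y in points]


# Invented scores and dissimilarities (exactly linear in distance, for clarity)
scores = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
dissim = {(i, j): 0.1 + 0.2 * dist(scores[i], scores[j])
          for i, j in combinations(range(len(scores)), 2)}

unit = half_change_unit(scores, dissim)
scaled = rescale(scores, unit)

# The same solution with all scores multiplied by an arbitrary constant
blown_up = [(7.0 * x, 7.0 * y) for x, y in scores]
scaled_again = rescale(blown_up, half_change_unit(blown_up, dissim))
```

Both scaled and scaled_again hold the same coordinates: the arbitrary
constant cancels, which is the sense in which half-change scaling
fixes the scale. The result is only as good as the regression behind
it.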
> 2) What effect does Kruskal's stress have on the scaling in metaMDS's
> analysis?
None.
>
> 3) Is Kruskal's stress multiplied by a factor of 100 in metaMDS as well,
> as metaMDS relies on isoMDS (see R-mailing list archive under ``isoMDS
> - high stress value and strange configuration'')?
>
metaMDS uses isoMDS and reports the same stress as isoMDS. That is, a
"percent stress": the stress multiplied by 100.
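As a small numerical illustration, assuming Kruskal's stress formula 1
(the square root of the summed squared differences between ordination
distances and their monotone-regression fits, divided by the summed
squared distances): the Python sketch below uses an invented function
name and made-up numbers, and is not the isoMDS code.

```python
from math import sqrt


def kruskal_stress1(distances, disparities):
    """Kruskal's stress formula 1 as a proportion between 0 and 1."""
    num = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    den = sum(d ** 2 for d in distances)
    return sqrt(num / den)


# Made-up ordination distances and their monotone-regression fits
distances = [1.0, 2.0, 3.0]
disparities = [1.0, 2.5, 2.5]

stress = kruskal_stress1(distances, disparities)
percent_stress = 100 * stress  # the "percent stress" convention

# Multiplying distances and fitted values by the same constant,
# as happens when the whole configuration is rescaled
stress_rescaled = kruskal_stress1([10 * d for d in distances],
                                  [10 * dh for dh in disparities])
```

Here stress and stress_rescaled agree, because the constant cancels
between numerator and denominator. Stress of this form does not
depend on the scale of the configuration.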
> Here's a description of my situation: My sequences come from 12 samples.
> Depending on their level of sequence similarity, I group them into 300
> to 16 groups (300 unique sequence types to 16 sequence types that allow
> sequences to be 20% different). For all the groupings, the overall
> observed distance in the data remains quite similar.
>
> I now want to see at what level of similarity, samples start coalescing
> in an ordination plot. For this I use metaMDS for various levels of
> similarity. I assume that I can see the samples coalesce by observing
> the range of the x and y axes shrink in the nmds plot (i.e. the
> ordination distance).
>
> As I expected, in general, the range of the x and y axes of the nmds
> plot decreases the less stringently I group sequences together. But
> there's one exception that puzzles me: One plot has vastly different
> ranges for the x and y axes than the other plots (200 times wider than
> for all the others).
You may inspect the scaling plot by calling metaMDS with argument
plot = TRUE, which plots the half-change scaling regression.
If you really have only 12 points, you may be stretching some
underlying logic beyond its breaking point.
>
> I noticed that for the exceptional grouping, the calculated Kruskal's
> stress was about three orders of magnitude smaller than for all the
> others, even though the raw data fed into metaMDS looks very much like
> its neighboring groupings. What is happening at this one very different
> analysis?
>
Three orders of magnitude is quite a lot for a value that is bounded
between 0 and 100, and values below one are surely artefacts: no
stress, but a complete mapping. Possibly you have too few points for
NMDS to be worthwhile. You can always map three points in a plane
with lighter machinery than NMDS.
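To make the three-point remark concrete, here is a minimal Python
sketch (the function name and the distances are invented for
illustration). It places three points in a plane from their pairwise
distances using plain triangle geometry, assuming the distances
satisfy the triangle inequality. The mapping is exact, so there is no
stress to speak of.

```python
from math import dist, sqrt


def place_three(d_ab, d_ac, d_bc):
    """Return plane coordinates for A, B, C that reproduce the three distances."""
    a = (0.0, 0.0)
    b = (d_ab, 0.0)  # put B on the x axis at distance d_ab from A
    # Law of cosines gives the x coordinate of C
    x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    # Guard against tiny negative values caused by rounding
    y = sqrt(max(d_ac ** 2 - x ** 2, 0.0))
    return a, b, (x, y)


# Made-up distances that satisfy the triangle inequality
a, b, c = place_three(3.0, 4.0, 5.0)
recovered = (dist(a, b), dist(a, c), dist(b, c))
```

The recovered distances equal the input distances up to rounding
error. With very few points the configuration is this unconstrained,
and a stress near zero tells you little.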
> I have not posted any sample data as it is a rather large amount of
> data. I tried producing a smaller dummy sample but those data did not
> reproduce the effect.
>
> Thanks for helping me out!
>
Cheers, jari