[R] An ANOVA test that uses a distance matrix like hierarchical cluster analysis?

Gavin Simpson gavin.simpson at ucl.ac.uk
Sun Apr 27 11:37:42 CEST 2008

```
Nick,

Take a look at adonis() in the vegan package, which implements a
multivariate analogue of ANOVA that uses a dissimilarity matrix as
input. betadisper() is a multivariate analogue of Levene's test and
tests for homogeneity of multivariate dispersions, again using a
dissimilarity matrix as input. adonis() compares the group means and
betadisper() compares the group dispersions.

I'd install vegan from the R-forge repository for these functions:

http://r-forge.r-project.org/projects/vegan/

because I fixed a bug in betadisper and adonis has extra functionality
in the current devel version. This has not yet made its way to CRAN.

The help pages for the relevant functions will guide you to references
that discuss the methodology implemented.

HTH

G

On Sun, 2008-04-27 at 15:24 +0800, Nick Flyger wrote:
> Hi All,
>
> I have a question which does not pertain directly to the use of R but comes
> from my use of R!
>
> I have data which can be described as 3-dimensional e.g. (x,y,z), with no
> negative component. The suggested way to analyze this data is via
> multivariate techniques or by calculating what amounts to a Levene's test on
> the data and then an ANOVA on the three components if the first test is
> significant or a t-test when only two groups are involved.
>
> I do not like either of the first methods because of the case of 3 or more
> groups. As an example, if I had three groups each with mean distance of 5
> from the origin (0, 0, 0) and a variance of 1 about that mean. Now say group
> A has a mean for the 3 components of (5, 0, 0), B a mean of (0, 5, 0) and
> C a mean of (0, 0, 5). In this case the ANOVA will find no difference
> between the groups because the mean difference and variances are identical.
> Yet we clearly see the groups are different. The t-test is valid because I
> can adjust the formula to accept the Euclidean difference between the mean
> scores of two groups.
>
> As an alternative I like to use hierarchical cluster analysis with the
> Euclidean distance matrix and bootstrapping for p-values. In this way I
> don't have to prematurely collapse the data to a single value per
> observation and the distance matrix allows for direct distance comparison
> between all the observations for all the groups similar to the t-test.
>
> However, prior to analysis I know the groups that the observations belong to
> and I would like to use ANOVA and post hoc tests to tease out the
> differences. Is there an ANOVA style of analysis that makes use of distance
> matrices?
>
> Any thoughts would be appreciated as statistics is not my specialty although
> I am happy with programming in R, Matlab etc...
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

```
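As a rough illustration of the two functions named in the reply, the sketch below simulates three groups centred at (5, 0, 0), (0, 5, 0) and (0, 0, 5), as in Nick's example, and runs both tests on the Euclidean distance matrix. The simulated data and all object names (`centres`, `grp`, `xyz`, `meta`, `fit`, `disp`) are made up for illustration. It also assumes a recent vegan release, where the distance-based ANOVA is called `adonis2()`; the `adonis()` named in the message is the older interface.

```r
# Sketch only: simulated data and object names are illustrative assumptions.
library(vegan)

set.seed(42)
n <- 20                                    # observations per group
centres <- rbind(A = c(5, 0, 0),
                 B = c(0, 5, 0),
                 C = c(0, 0, 5))
grp <- factor(rep(rownames(centres), each = n))

# Unit-variance noise around each group's centre (60 rows, 3 columns)
xyz <- centres[as.character(grp), ] + matrix(rnorm(3 * n * 3), ncol = 3)
rownames(xyz) <- NULL

d    <- dist(xyz)                          # Euclidean distance matrix
meta <- data.frame(grp = grp)

# Distance-based analogue of ANOVA: permutation test for group location
fit <- adonis2(d ~ grp, data = meta, permutations = 999)
print(fit)

# Analogue of Levene's test: homogeneity of multivariate dispersions
disp <- betadisper(d, grp)
print(permutest(disp, permutations = 999))
```

The two tests are best read together: a significant location test is easier to interpret when the dispersion test gives no sign that the groups differ in spread.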