[R] calculating dissimilarities in R
Martin Maechler
maechler at stat.math.ethz.ch
Tue Sep 26 09:55:50 CEST 2006
Hi Elvina,
>>>>> "Elvina" == Elvina Payet <virgin at seychelles.sc>
>>>>> on Tue, 26 Sep 2006 05:48:01 GMT writes:
Elvina> Dear All,
Elvina> I’ve got a statistical question on calculating
Elvina> dissimilarities in R.
Elvina> I want to calculate the different types of dissimilarities
Elvina> on the ‘flower’ dataset found in the package
Elvina> ‘cluster’. Flower is a data frame with 18 observations
Elvina> on 8 variables. Variables 1 and 2 are binary, variable 3 is
Elvina> asymmetric binary, variable 4 is nominal, variables 5 and 6
Elvina> are ordered and variables 7 and 8 are interval scaled.
Elvina> Commands to load the dataset in R.
> library(cluster)
> data(flower)
or data(flower, package = "cluster")
Elvina> What are the different types of dissimilarities that can be
Elvina> calculated on such a dataset?
Elvina> Do I need to group the variables by type first, i.e. all
Elvina> binary together, and then run the calculation? Should I use
Elvina> dissimilarity indices such as Jaccard, or should a
Elvina> function such as ‘daisy’ be used?
Yes, you should use daisy() to calculate dissimilarities,
particularly when you are interested in the difference between
symmetric and asymmetric binary variables.
Do read help(daisy) and look at its examples.
Maybe this will answer all your questions; if not, it will help
you to ask a much more specific question, as suggested by the
posting guide (see link below!)
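
As a rough sketch of what such a call can look like on the flower
data (the object names d.sym and d.asym are made up for this
illustration, and the choice of declaring only variable 3 as
asymmetric is an assumption taken from the description of the data
above, not a recommendation):

```r
library(cluster)   # provides daisy() and the flower data
data(flower)

## Default call: because flower contains factor columns, daisy()
## uses Gower's coefficient (metric "mixed"). Binary factors are
## then treated like nominal ones, i.e. symmetrically.
d.sym <- daisy(flower)

## Assumption for this sketch: variable 3 is declared asymmetric
## binary through the 'type' argument.
d.asym <- daisy(flower, type = list(asymm = 3))

## Compare the two sets of dissimilarities
summary(d.sym)
summary(d.asym)

## The result is a "dissimilarity" object (which also inherits from
## "dist"); a corner of the full matrix can be inspected like this:
as.matrix(d.asym)[1:4, 1:4]
```

There is no need to split the variables into groups by hand here:
daisy() takes the whole data frame and handles each column according
to its type.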
[.........]
virgin> ______________________________________________
[.........]
virgin> PLEASE do read the posting guide
virgin> http://www.R-project.org/posting-guide.html
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
virgin> and provide commented, minimal, self-contained, reproducible code.
Regards,
Martin Maechler, ETH Zurich