[R] Clustering Categorial and Continuous Variables
Wolski
wolski at molgen.mpg.de
Thu Jun 10 13:26:36 CEST 2004
Hi!
You need an appropriate dissimilarity measure.
Have a look at daisy() in the cluster package:
help("daisy",package="cluster")
x: numeric matrix or data frame. Dissimilarities will be
computed between the rows of 'x'. Columns of mode 'numeric'
(i.e. all columns when 'x' is a matrix) will be recognized as
interval scaled variables, columns of class 'factor' will be
recognized as nominal variables, and columns of class
'ordered' will be recognized as ordinal variables. Other
variable types should be specified with the 'type' argument.
Missing values ('NA's) are allowed.
...
For example, Gower (1971) proposed a coefficient for variables of mixed type (categorical, continuous, binary).
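A minimal sketch of how this could look in practice, assuming the cluster package is installed. The data frame and its column names are hypothetical toy data, not taken from the original question. When the columns are of mixed type, daisy() computes Gower-type dissimilarities, and the result can be passed to any method that accepts a dissimilarity object, such as hclust() or pam():

```r
library(cluster)  # provides daisy() and pam()

# Hypothetical toy data: one continuous, one nominal and one ordinal column
df <- data.frame(
  height = c(1.60, 1.75, 1.82, 1.68, 1.90, 1.55),
  colour = factor(c("red", "blue", "red", "green", "blue", "green")),
  grade  = factor(c("low", "mid", "high", "mid", "high", "low"),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# Dissimilarities between the rows of df; numeric columns are treated as
# interval scaled, factors as nominal, ordered factors as ordinal
d <- daisy(df, metric = "gower")

# Hierarchical clustering on the dissimilarities, cut into two groups
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)

# Partitioning around medoids on the same dissimilarities
pm <- pam(d, k = 2)
```

The choice of k = 2 and of average linkage is arbitrary here and only serves to illustrate the calls; summary(d) reports which variable types daisy() assigned to each column.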
Sincerely,
Eryk
*********** REPLY SEPARATOR ***********
On 6/10/2004 at 11:52 AM Wayne Jones wrote:
>>>Hi there fellow R users,
>>>
>>>R has many different clustering packages (e.g. mclust,cluster,e1071).
>>>
>>>However, can anyone recommend a method to deal with data sets that contain
>>>categorical and continuous variables?
>>>
>>>Regards
>>>
>>>Wayne
>>>
>>>
>>>
>>>KSS Ltd
>>>Seventh Floor St James's Buildings 79 Oxford Street Manchester M1
>>>6SS England
>>>Company Registration Number 2800886
>>>Tel: +44 (0) 161 228 0040 Fax: +44 (0) 161 236 6305
>>>mailto:kssg at kssg.com http://www.kssg.com
Dipl. bio-chem. Eryk Witold Wolski @ MPI-Moleculare Genetic
Ihnestrasse 63-73 14195 Berlin 'v'
tel: 0049-30-83875219 / \
mail: wolski at molgen.mpg.de ---W-W---- http://www.molgen.mpg.de/~wolski