Abstract: Finding clusters in large datasets is a difficult task. Almost all computationally feasible methods are related to k-means and need a clear partition structure in the data, while most such datasets contain masking outliers and other deviations from the usual models of partitioning cluster analysis. It is possible to look for clusters informally using graphical tools like the grand tour, but the meaning and the validity of such patterns are unclear. In this paper, a three-step approach is suggested: In the first step, data visualization methods like the grand tour are used to find cluster candidate subsets of the data. In the second step, reproducible clusters are generated from them by means of fixed point clustering, a method to find a single cluster at a time based on the Mahalanobis distance. In the third step, the validity of the clusters is assessed by use of classification plots. The approach is applied to an astronomical dataset of spectra from the Hamburg/ESO survey.
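The second step can be illustrated with a minimal sketch, assuming a simplified form of the iteration: starting from a candidate subset, estimate mean and covariance from the current members, then re-select every point whose squared Mahalanobis distance lies below a cutoff, and repeat until the subset no longer changes. The function name `fixed_point_cluster`, the cutoff value, the synthetic data and the starting subset are all assumptions made for illustration; the tuning used in the paper may differ.

```python
import numpy as np


def fixed_point_cluster(X, start_idx, threshold, max_iter=100):
    """Hypothetical helper: iterate a Mahalanobis-based selection to a fixed point.

    X          (n, p) data matrix
    start_idx  indices of the candidate subset (e.g. found by visual inspection)
    threshold  assumed cutoff on the squared Mahalanobis distance
    Returns a boolean membership vector of length n.
    """
    n, p = X.shape
    member = np.zeros(n, dtype=bool)
    member[start_idx] = True
    for _ in range(max_iter):
        sub = X[member]
        if sub.shape[0] <= p:
            break  # too few points for a usable covariance estimate
        mean = sub.mean(axis=0)
        cov = np.atleast_2d(np.cov(sub, rowvar=False))
        diff = X - mean
        # Squared Mahalanobis distance of every point to the current subset.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.pinv(cov), diff)
        new_member = d2 <= threshold
        if np.array_equal(new_member, member):
            break  # the subset reproduces itself: a fixed point
        member = new_member
    return member


# Synthetic example (assumed data): two well-separated Gaussian groups.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(20.0, 1.0, size=(100, 2)),
])
# Start from 20 points of the first group; 9.21 is roughly the 99% quantile
# of a chi-squared distribution with 2 degrees of freedom.
member = fixed_point_cluster(X, start_idx=np.arange(20), threshold=9.21)
print(member[:100].sum(), member[100:].sum())
```

Because the subset is re-selected from the whole dataset at every iteration, the result depends on the starting candidate only through the fixed point it converges to, which is what makes a visually found pattern reproducible.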
Download: Compressed Postscript (811 Kb) / PDF (193 Kb).