[R] Clustering methods for data that has bimodal distribution
Adrian Johnson
oriolebaltimore at gmail.com
Mon Dec 5 04:52:33 CET 2016
Dear group,
Pardon me for a naive question. I have a data matrix (11K rows, 4K columns).
The values range from -1 to 1. They are not integers but real
numbers, with precision to at least the millionths place.
The data distribution is peculiar: if I do plot(density(myMatrix)), I
get a nice bimodal curve (one bell-shaped peak between -1 and 0
and another between 0 and 1).
I am interested in clustering the data using consensus clustering
(which uses K-means).
My questions are:
1. Given that my data range between -1 and 1, is K-means an appropriate
method, considering that the data might have ties?
2. Although K-means is non-parametric, would bimodally distributed
data be okay as input to K-means?
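For concreteness, here is a minimal sketch of the kind of setup I mean, on
simulated data. Everything in it is an assumption for illustration only: the
matrix sim, its size (200 x 20 instead of 11K x 4K), the peak centres and sd
are made up, and plain kmeans() stands in for the consensus clustering step.

```r
# Simulated stand-in for the real matrix (all sizes and parameters are hypothetical).
set.seed(1)
n_rows <- 200
n_cols <- 20

# Half of the rows are centred near -0.5 and half near +0.5,
# so the pooled values form two peaks, one on each side of 0.
row_centre <- rep(c(-0.5, 0.5), each = n_rows / 2)
sim <- matrix(rnorm(n_rows * n_cols, mean = row_centre, sd = 0.15),
              nrow = n_rows, ncol = n_cols)

# Clamp to [-1, 1] to mimic the range of the real data.
sim <- pmin(pmax(sim, -1), 1)

# Pooled density of all values, as in plot(density(myMatrix)).
d <- density(as.vector(sim))
# plot(d)

# Cluster the rows with K-means; nstart repeats the random initialisation.
fit <- kmeans(sim, centers = 2, nstart = 10)

# Cross-tabulate cluster labels against the simulated row groups.
print(table(cluster = fit$cluster,
            group = rep(c("low", "high"), each = n_rows / 2)))
```

Note that the density here is computed on all values pooled together, while
kmeans() works on distances between whole rows.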
I would appreciate any suggestions.
Thanks
Adrian.