[R] how to cluster rows of words in a text file
mail me
mailme842 at googlemail.com
Fri Mar 23 19:03:17 CET 2012
Hi:
I am trying to cluster the rows of a text file with kmeans.
I load the data as follows:
file1 <- read.csv("somefile.csv")
and printing it shows the following rows of words:
> file1
1 word1 word3 word4 word1
2 word1 word4 word3 word1
3 word4 word2 word4 word3
4 word4 word2 word1 word3
5 word2 word2 word4 word2
file_as_matrix <- as.matrix(file1)
Now, I want to apply some clustering algorithm such as kmeans to
cluster the rows in the file to get the following output:
Cluster1
word1 word3 word4 word1
word1 word4 word3 word1
Cluster2
word4 word2 word4 word3
word4 word2 word1 word3
word2 word2 word4 word2
But since kmeans takes a numeric matrix as input, it cannot be
used directly to cluster the rows in this case.
Is there any simple way to cluster the rows of such a text file?
Example code would be really useful.
Thanks and regards:
debb
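
A minimal sketch of one possible approach to the question above, not a definitive answer: encode each row as a vector of word counts (a bag-of-words matrix), which gives kmeans the numeric input it needs. The rows are typed inline rather than read from somefile.csv, and the names rows, tokens, vocab, counts and fit are illustrative assumptions. Word order is ignored by this encoding, so the resulting groups depend on that choice and need not match the grouping shown in the post.

```r
# Rows from the post, entered inline so the sketch does not depend on somefile.csv.
rows <- c("word1 word3 word4 word1",
          "word1 word4 word3 word1",
          "word4 word2 word4 word3",
          "word4 word2 word1 word3",
          "word2 word2 word4 word2")

# Split each row into words and collect the sorted vocabulary.
tokens <- strsplit(rows, " ", fixed = TRUE)
vocab  <- sort(unique(unlist(tokens)))

# Bag-of-words matrix: one row per line, one column per word, entries are counts.
counts <- t(vapply(tokens,
                   function(w) as.numeric(table(factor(w, levels = vocab))),
                   numeric(length(vocab))))
colnames(counts) <- vocab

# kmeans starts from randomly chosen centres, so fix the seed and
# use several restarts to make the result more stable.
set.seed(1)
fit <- kmeans(counts, centers = 2, nstart = 20)

# Print the original rows grouped by their cluster label.
print(split(rows, fit$cluster))
```

With real data, the same steps should apply after pasting the columns of each row of file1 into one string, or by building counts directly from as.matrix(file1). If word position matters, a different encoding (or a distance-based method such as hclust on a custom dissimilarity) may be a better fit than plain word counts.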