Cluster thousands of text documents in Java

Question

Is there an efficient way to cluster text documents? I thought about K-Means, but it seems too time-consuming. Can somebody suggest an efficient method?

Radi · Answer 1 · 2010-12-24T17:17:38.557

1

The choice of clustering algorithm depends on your dataset. Do you want to write an algorithm in Java yourself to cluster your documents? Instead of reinventing the wheel, you can use Weka and try other clustering algorithms on your dataset.

answered Dec 24 '10 at 11:01

Radi

score 1 · Accepted Answer · edited May 23 '17 at 11:47

If K-Means actually does the job and merely seems slow, why not try to make it faster? The method I use is random pausing.

There is usually a lot of room for speedup, in code you wouldn't have suspected of being a problem, without changing the basic algorithm. Here's an example.
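Random pausing is normally done by hand: interrupt the program in a debugger a handful of times and read the call stack each time. As a rough automated stand-in (not the linked example, and not necessarily how the answer's author does it), the sketch below snapshots a worker thread's stack at fixed intervals and counts how many snapshots each frame appears in. The class `StackSampler`, the workload `demoWorkload`, and the helper `sharedWords` are hypothetical names invented for this illustration; only `Thread.getStackTrace()` and the other JDK calls are real APIs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: periodically snapshots a worker thread's call stack. */
final class StackSampler {
    private final Map<String, Integer> hits = new HashMap<>();
    private int samples = 0;

    /** Runs work on its own thread and records that thread's stack every intervalMillis. */
    void run(Runnable work, long intervalMillis) {
        Thread worker = new Thread(work, "sampled-worker");
        worker.start();
        while (worker.isAlive()) {
            StackTraceElement[] stack = worker.getStackTrace();
            if (stack.length > 0) {
                samples++;
                // Count each frame at most once per snapshot, even under recursion.
                Set<String> seen = new HashSet<>();
                for (StackTraceElement f : stack) {
                    String key = f.getClassName() + "." + f.getMethodName()
                            + ":" + f.getLineNumber();
                    if (seen.add(key)) {
                        hits.merge(key, 1, Integer::sum);
                    }
                }
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
    }

    int samples() { return samples; }

    Map<String, Integer> hits() { return hits; }

    /** Prints the frames that appeared in the largest share of snapshots. */
    void report(int topN) {
        hits.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())
            .limit(topN)
            .forEach(e -> System.out.printf("%3d%%  %s%n",
                    100 * e.getValue() / Math.max(samples, 1), e.getKey()));
    }

    /** Assumed workload for the demo: repeats a wasteful similarity call for millis ms. */
    static int demoWorkload(long millis) {
        String a = "the quick brown fox jumps over the lazy dog";
        String b = "the lazy dog sleeps while the quick fox runs";
        long end = System.nanoTime() + millis * 1_000_000L;
        int total = 0;
        while (System.nanoTime() < end) {
            total += sharedWords(a, b);
        }
        return total;
    }

    /** Deliberately re-tokenizes both strings on every call. */
    static int sharedWords(String a, String b) {
        Set<String> wordsA = new HashSet<>(Arrays.asList(a.split(" ")));
        int shared = 0;
        for (String w : b.split(" ")) {
            if (wordsA.contains(w)) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        StackSampler sampler = new StackSampler();
        sampler.run(() -> demoWorkload(500), 10);
        sampler.report(8);
    }
}
```

A frame that shows up in a large share of the snapshots accounts for roughly that share of the running time, so it is the first place to look. In a K-Means run over text, one would pass the clustering call as the `Runnable` instead of `demoWorkload`; if most snapshots land in tokenizing or in rebuilding vectors inside the distance function, caching that work is a speedup that leaves the algorithm itself unchanged.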

answered Dec 24 '10 at 16:26

Mike Dunlavey
