> I mean measuring the similarity between the document in each cluster.
> Also, difference between document on one cluster with another cluster.
>
> I saw the sample code ClusteringQualityBencmark.java
> However, I do not know how to make use of it for assessing my Solr
> Clustering performance.
>
You'd need to write your own code for this. The most common clustering quality measures of the kind you mentioned are described here:

http://en.wikipedia.org/wiki/Cluster_analysis#Evaluation_of_clustering_results

These are meant for the general case (numeric attributes), so to apply them to texts you'd need to use the vector representation of the documents.

On a more general note, synthetic measures test only the document-cluster assignments; none of them takes the quality of the labels into account (this is really hard to measure objectively).

Staszek
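P.S. To make the "vector representation" part more concrete, here is a rough sketch of how the two numbers you asked about (similarity inside a cluster, similarity between two clusters) could be computed. It is not part of Solr or Carrot2: the class name ClusterSimilaritySketch and all of its methods are made up for illustration, and it assumes plain term-frequency vectors with whitespace tokenization and cosine similarity. In practice you would probably want TF-IDF weights and the same analysis chain your Solr field uses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Solr or Carrot2 API.
final class ClusterSimilaritySketch {

    /** Builds a raw term-frequency vector (assumption: whitespace tokens, lower-cased). */
    static Map<String, Double> termVector(String text) {
        Map<String, Double> vector = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                vector.merge(token, 1.0, Double::sum);
            }
        }
        return vector;
    }

    /** Cosine similarity of two sparse vectors; 0.0 if either vector is empty. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        double norms = norm(a) * norm(b);
        return norms == 0.0 ? 0.0 : dot / norms;
    }

    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    /** Mean pairwise cosine similarity of the documents inside one cluster. */
    static double intraClusterSimilarity(List<Map<String, Double>> cluster) {
        double sum = 0.0;
        int pairs = 0;
        for (int i = 0; i < cluster.size(); i++) {
            for (int j = i + 1; j < cluster.size(); j++) {
                sum += cosine(cluster.get(i), cluster.get(j));
                pairs++;
            }
        }
        // With fewer than two documents the measure is undefined; reported as 0.0 here.
        return pairs == 0 ? 0.0 : sum / pairs;
    }

    /** Mean cosine similarity over all document pairs drawn from two different clusters. */
    static double interClusterSimilarity(List<Map<String, Double>> first,
                                         List<Map<String, Double>> second) {
        double sum = 0.0;
        int pairs = 0;
        for (Map<String, Double> a : first) {
            for (Map<String, Double> b : second) {
                sum += cosine(a, b);
                pairs++;
            }
        }
        return pairs == 0 ? 0.0 : sum / pairs;
    }

    public static void main(String[] args) {
        // Toy clusters; in a real test these would come from the clustering response.
        List<Map<String, Double>> search = Arrays.asList(
                termVector("solr search"), termVector("solr search engine"));
        List<Map<String, Double>> pets = Arrays.asList(
                termVector("cat dog"), termVector("dog food"));

        System.out.println("intra(search) = " + intraClusterSimilarity(search));
        System.out.println("inter(search, pets) = " + interClusterSimilarity(search, pets));
    }
}
```

A good clustering should give high intra-cluster values and low inter-cluster values. As noted above, this says nothing about whether the cluster labels are any good.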