Thank you so much for your explanation.

On 2 June 2015 at 17:31, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:
> The aim there is to make clustering lighter and more closely related to
> the query.
> The summary produced is a fragment surrounding the query terms in the
> document content.
> Arguably this is a way to improve the quality of the clusters, but it
> certainly makes the clustering operation lighter, as the content used to
> produce the clusters is much smaller than the full content.
>
> We can of course discuss whether the window of text surrounding the query
> matches really helps to cluster the documents more precisely.
> That is not an easy research topic, and it certainly depends strictly on
> the use case.
> For this reason the user should decide whether to go with the summary
> (lighter) approach or the more comprehensive, full-content approach.
>
> Cheers
>
> 2015-06-02 3:21 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>
> > Thank you so much Alessandro.
> >
> > But I do not find any difference in the quality of the clustering
> > results when I change the hl.fragsize value, even though I've set my
> > carrot.produceSummary to true.
> >
> > Regards,
> > Edwin
> >
> > On 1 June 2015 at 17:31, Alessandro Benedetti <benedetti.ale...@gmail.com>
> > wrote:
> >
> > > Just to clarify the initial mail: carrot.fragSize has nothing to do
> > > with the number of clusters produced.
> > >
> > > When you choose to work with the field summary (you will work only on
> > > snippets from the original content, produced by highlighting the
> > > query in the content), the fragSize specifies the size of these
> > > fragments.
> > >
> > > From the Carrot documentation:
> > >
> > > carrot.produceSummary
> > >
> > > When true, the carrot.snippet
> > > <https://wiki.apache.org/solr/ClusteringComponent#carrot.snippet>
> > > field (if no snippet field, then the carrot.title
> > > <https://wiki.apache.org/solr/ClusteringComponent#carrot.title> field)
> > > will be highlighted and the highlighted text will be used for
> > > clustering. Highlighting is recommended when the snippet field
> > > contains a lot of content. Highlighting can also increase the quality
> > > of clustering because the clustered content will get an additional
> > > query-specific context.
> > >
> > > carrot.fragSize
> > >
> > > The frag size to use for highlighting. Meaningful only when
> > > carrot.produceSummary
> > > <https://wiki.apache.org/solr/ClusteringComponent#carrot.produceSummary>
> > > is true. If not specified, the default highlighting fragsize
> > > (hl.fragsize) will be used. If that isn't specified, then 100.
> > >
> > > Cheers
> > >
> > > 2015-06-01 2:00 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> > >
> > > > Thank you Stanislaw for the links. Will read them up to better
> > > > understand how the algorithm works.
> > > >
> > > > Regards,
> > > > Edwin
> > > >
> > > > On 29 May 2015 at 17:22, Stanislaw Osinski
> > > > <stanislaw.osin...@carrotsearch.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > The number of clusters primarily depends on the parameters of the
> > > > > specific clustering algorithm. If you're using the default Lingo
> > > > > algorithm, the number of clusters is governed by
> > > > > the LingoClusteringAlgorithm.desiredClusterCountBase parameter.
> > > > > Take a look at the documentation
> > > > > (https://cwiki.apache.org/confluence/display/solr/Result+Clustering#ResultClustering-TweakingAlgorithmSettings)
> > > > > for some more details (the "Tweaking at Query-Time" section shows
> > > > > how to pass the specific parameters at request time). A complete
> > > > > overview of the Lingo clustering algorithm parameters is here:
> > > > > http://doc.carrot2.org/#section.component.lingo.
> > > > >
> > > > > Stanislaw
> > > > >
> > > > > --
> > > > > Stanislaw Osinski, stanislaw.osin...@carrotsearch.com
> > > > > http://carrotsearch.com
> > > > >
> > > > > On Fri, May 29, 2015 at 4:29 AM, Zheng Lin Edwin Yeo
> > > > > <edwinye...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I'm trying to increase the number of cluster results shown during
> > > > > > the search. I tried setting carrot.fragSize=20, but only 15
> > > > > > cluster labels are shown. Even when I set carrot.fragSize=5,
> > > > > > 15 labels are still shown.
> > > > > >
> > > > > > Is this the correct way to do this? I understand that setting it
> > > > > > to 20 might not necessarily mean 20 labels will be shown, as the
> > > > > > setting is for the maximum number. But when I set it to 5,
> > > > > > shouldn't it reduce the number of labels to 5?
> > > > > >
> > > > > > I'm using Solr 5.1.
> > > > > >
> > > > > > Regards,
> > > > > > Edwin
> > >
> > > --
> > > --------------------------
> > >
> > > Benedetti Alessandro
> > > Visiting card : http://about.me/alessandro_benedetti
> > >
> > > "Tyger, tyger burning bright
> > > In the forests of the night,
> > > What immortal hand or eye
> > > Could frame thy fearful symmetry?"
> > >
> > > William Blake - Songs of Experience - 1794 England
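
For reference, the parameters discussed in this thread can be passed at request time. Below is a minimal Python sketch that only builds the request URL and does not contact Solr. The host, the core name `mycore`, the `/clustering` handler path and the helper name `build_clustering_url` are assumptions for illustration, not something stated in the thread. The `carrot.*` and `LingoClusteringAlgorithm.desiredClusterCountBase` parameter names are the ones quoted above; the `clustering=true` switch is taken from the Solr clustering documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_clustering_url(base_url, query, frag_size=None,
                         cluster_count_base=None, produce_summary=True):
    """Build a Solr clustering request URL (hypothetical helper).

    frag_size          -> carrot.fragSize: size of the highlighted fragments
                          fed to the clusterer; it does not control the
                          number of clusters.
    cluster_count_base -> LingoClusteringAlgorithm.desiredClusterCountBase:
                          the Lingo parameter that governs the number of
                          clusters.
    """
    params = {
        "q": query,
        "clustering": "true",
        "carrot.produceSummary": "true" if produce_summary else "false",
    }
    # Only send the optional parameters when the caller sets them, so that
    # Solr falls back to its own defaults otherwise.
    if frag_size is not None:
        params["carrot.fragSize"] = str(frag_size)
    if cluster_count_base is not None:
        params["LingoClusteringAlgorithm.desiredClusterCountBase"] = str(
            cluster_count_base
        )
    return base_url + "?" + urlencode(params)


# Assumed endpoint: adjust host, core and handler to your own setup.
url = build_clustering_url(
    "http://localhost:8983/solr/mycore/clustering",
    "solr",
    frag_size=100,
    cluster_count_base=20,
)
print(url)
```

Changing `frag_size` here only changes how much text around the query match is clustered; to ask Lingo for more or fewer clusters, change `cluster_count_base` instead.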