Re: Term Frequency Calculation - Clarification

2015-05-20 Thread ariya bala
Please ignore.
On Wed, May 20, 2015 at 2:45 PM, ariya bala wrote:
> Thanks Jack.
> In my case there is only one document - Foo Foo is in bar
> As per your comment, I should expect TF to be 2.
> But I am getting one.
> Is there any check where if one match is a subset of other, is calculated
> o

Re: Term Frequency Calculation - Clarification

2015-05-20 Thread ariya bala
Thanks Jack.
In my case there is only one document - Foo Foo is in bar
As per your comment, I should expect TF to be 2.
But I am getting 1.
Is there any check so that, if one match is a subset of another, it is counted only once?
My class extends DefaultSimilarity.
Cheers
Ariya Bala S
On Wed, May 20, 201

Re: Term Frequency Calculation - Clarification

2015-05-20 Thread Jack Krupansky
Yes. tf is both 1 and 2 - tf is per document, which is 1 for the first document and 2 for the second document.
See: http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
-- Jack Krupansky
On Wed, May 20, 2015 at 6:13 AM, ariya bala wrote:
> Hi,
>

Term Frequency Calculation - Clarification

2015-05-20 Thread ariya bala
Hi,
I have made a custom class for scoring the similarity (TermFrequencyBiasedSimilarity).
The score is derived by considering just the TF part (achieved by setting IDF=1).
Question is:
*Document content:* Foo Foo is in bar
*Search query:* Foo bar
*slop:* 3
With slop 3, there ar
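
For readers following the thread, here is a rough, self-contained sketch of the quantities involved in a TF-only similarity. It does not use Lucene and is not the poster's TermFrequencyBiasedSimilarity; the class name TfOnlySketch and the helper rawTermCount are hypothetical. The tf and sloppyFreq formulas are written to mirror what DefaultSimilarity does in Lucene 5.1 (square root of the frequency, and 1 / (distance + 1) per sloppy match), and idf is pinned to 1 as described in the question. How Lucene's sloppy phrase scorer enumerates overlapping matches is deliberately not modelled here.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-in for a TF-only similarity; it does not depend on Lucene.
class TfOnlySketch {

    // Mirrors DefaultSimilarity.tf in Lucene 5.1: square root of the frequency.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // IDF pinned to 1 so that only the TF part influences the score.
    static float idf(long docFreq, long numDocs) {
        return 1.0f;
    }

    // Mirrors DefaultSimilarity.sloppyFreq in Lucene 5.1: weight of one sloppy match.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Raw count of a single term in one document.
    // Assumption: naive whitespace tokenization plus lowercasing, not a real analyzer.
    static int rawTermCount(String document, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        return (int) Arrays.stream(document.toLowerCase(Locale.ROOT).split("\\s+"))
                .filter(needle::equals)
                .count();
    }

    public static void main(String[] args) {
        String doc = "Foo Foo is in bar";
        System.out.println("count(foo) = " + rawTermCount(doc, "Foo"));
        System.out.println("count(bar) = " + rawTermCount(doc, "bar"));
        System.out.println("tf(count(foo)) = " + tf(rawTermCount(doc, "Foo")));
        System.out.println("sloppyFreq(2) = " + sloppyFreq(2));
    }
}
```

Note that the raw count of a single term is per document, while the frequency fed to tf for a phrase query with slop is built from sloppy match weights rather than from these single-term counts, so the two numbers need not agree.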