Re: Term Frequency Calculation - Clarification

ariya bala Wed, 20 May 2015 05:57:14 -0700

Please ignore.


On Wed, May 20, 2015 at 2:45 PM, ariya bala <ariya...@gmail.com> wrote:

> Thanks Jack.
> In my case there is only one document: Foo Foo is in bar
> As per your comment, I should expect TF to be 2.
> But I am getting 1.
> Is there a check so that, when one match is a subset of another, it is
> counted only once?
> My class extends DefaultSimilarity.
>
> Cheers
> Ariya Bala S
>
> On Wed, May 20, 2015 at 2:09 PM, Jack Krupansky <jack.krupan...@gmail.com>
> wrote:
>
>> Yes.
>>
>> tf is both 1 and 2 - tf is per document, which is 1 for the first document
>> and 2 for the second document.
>>
>> See:
>>
>> http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>>
>>
>> -- Jack Krupansky
>>
>> On Wed, May 20, 2015 at 6:13 AM, ariya bala <ariya...@gmail.com> wrote:
>>
>> > Hi,
>> > I have written a custom class for scoring the similarity
>> > (TermFrequencyBiasedSimilarity).
>> > The score is derived from the TF part alone (achieved by setting
>> > IDF=1).
>> >
>> > Question is:
>> > -----------------
>> > *Document content:* Foo Foo is in bar
>> > *Search query:* Foo bar
>> > *slop:* 3
>> >
>> > With slop 3, there are two matches to the query:
>> >  Foo is in bar
>> >  Foo Foo is in bar
>> >
>> > *Should the Term Frequency be 1 or 2? Please also point me to an
>> > explanation of the logic implemented in Lucene/Solr.*
>> >
>> > --
>> > Cheers
>> > *Ariya *
>> >
>>
>
>
>
> --
> *Ariya *
>



-- 
*Ariya *
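
For readers following the thread, here is a rough sketch of the arithmetic
behind the question. It is not Lucene code. The two small functions mirror the
formulas in DefaultSimilarity (tf(freq) = sqrt(freq) and
sloppyFreq(distance) = 1 / (distance + 1)); candidate_distances is a
hypothetical helper that lists the in-order "Foo ... bar" pairs within the
slop, and the lowercased tokens are an assumption about the analyzer. Whether
the sloppy phrase scorer counts both overlapping pairs or only one is the open
point of the thread, so the sketch prints both possibilities rather than
claiming either.

```python
import math


def sloppy_freq(distance):
    # Same formula as DefaultSimilarity.sloppyFreq: closer matches weigh more.
    return 1.0 / (distance + 1)


def tf(freq):
    # Same formula as DefaultSimilarity.tf: square root of the frequency.
    return math.sqrt(freq)


def candidate_distances(tokens, first, second, slop):
    # Hypothetical helper (not a Lucene API): for a two-term phrase, list the
    # number of positions separating each in-order (first, second) pair,
    # keeping only pairs that fit within the slop.
    firsts = [i for i, t in enumerate(tokens) if t == first]
    seconds = [i for i, t in enumerate(tokens) if t == second]
    distances = []
    for i in firsts:
        for j in seconds:
            d = j - i - 1  # 0 means the terms are adjacent
            if 0 <= d <= slop:
                distances.append(d)
    return sorted(distances)


tokens = "foo foo is in bar".split()  # assumed analyzer output
distances = candidate_distances(tokens, "foo", "bar", slop=3)

# If only the closest pair is counted:
freq_closest = sloppy_freq(distances[0])
# If both overlapping pairs are counted:
freq_both = sum(sloppy_freq(d) for d in distances)

print("distances:", distances)
print("closest pair only: freq=%.4f tf=%.4f" % (freq_closest, tf(freq_closest)))
print("both pairs:        freq=%.4f tf=%.4f" % (freq_both, tf(freq_both)))
```

Note that under these formulas a sloppy phrase frequency is a sum of
fractional weights, so it need not be a whole number such as 1 or 2.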
