Re: Boost matches occurring early in the field (offset)

Alexandre Rafalovitch Wed, 29 Aug 2018 13:53:28 -0700

TokenOffsetPayloadTokenFilter? It is mentioned in
https://www.slideshare.net/lucidworks/payloads-in-solr-erik-hatcher-lucidworks
but no detailed example seems to be given.


I do see this question from time to time, so a definitive answer
would be useful for the future.

Regards,
   Alex.

On 29 August 2018 at 16:18, Jan Høydahl <[email protected]> wrote:
> I also tend to use "sentinel tokens" for exact match or to anchor a search. 
> But in order to obtain decaying boost the further down in the article a match 
> is, you'd need to write several such span/slop queries with varying slops, 
> e.g. highest boost for first 10 words, medium boost for first 50 words, low 
> boost for first 150 words, no boost below that.
>
> As I wrote in my initial mail, we can do such workarounds, or play with 
> payloads etc. But my real question is whether/how it is possible to factor 
> the actual term offset information from a matching term into the scoring 
> algorithm? Would you need to implement your own Scorer/Weight impl?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
>> 29. aug. 2018 kl. 15:37 skrev Doug Turnbull 
>> <[email protected]>:
>>
>> You can also insert a token at the beginning of the query during analysis
>> using a char filter. I call these sorts of boundary tokens "sentinel
>> tokens". So a phrase search for "red shoes" becomes "<SENT_BEG> red shoes".
>> You can add some slop to allow for permissible distance (with
>>
>> You can also use the Limit Token Count Token Filter and create a copyField,
>> so if you want to boost matches within the first 10 tokens, just limit the
>> copy to 10 tokens and then use that field in a boost query
>> https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-LimitTokenCountFilter
>>
>> -Doug
>>
>> On Wed, Aug 29, 2018 at 6:26 AM Mikhail Khludnev <[email protected]> wrote:
>>
>>> <SpanFirst>
>>> <https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-XMLQueryParser>
>>>
>>> On Wed, Aug 29, 2018 at 1:19 PM Jan Høydahl <[email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there an ootb way to boost term matches based on their position/offset
>>>> inside a field, so that the term gets a higher score if it occurs in the
>>>> beginning of the field and lower boost or a deboost if it occurs towards
>>>> the end of a field?
>>>>
>>>> I know that I could index the first part of the text in a new field and
>>>> boost on that, but that is kind of "binary".
>>>> I could also add the term offset as payload for every term and boost on
>>>> that, but this should not be necessary since offset info is already part
>>>> of the index?
>>>>
>>>> --
>>>> Jan Høydahl, search solution architect
>>>> Cominvent AS - www.cominvent.com
>>>>
>>>>
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>>
>> --
>> CTO, OpenSource Connections
>> Author, Relevant Search
>> http://o19s.com/doug
>
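
For the archive, the tiered workaround Jan describes in the quoted text
(highest boost for the first 10 words, medium for the first 50, low for
the first 150, none below that) might be sketched roughly as follows.
This is a minimal Python sketch that only builds a query string for the
XML Query Parser that Mikhail linked, using its SpanFirst, SpanTerm and
BooleanQuery elements. The field name "body", the term "shoes", the
helper names and the boost values 3.0/2.0/1.0 are assumptions made for
illustration; only the 10/50/150 cut-offs come from the thread.

```python
from xml.sax.saxutils import escape, quoteattr

# (end, boost) pairs. The cut-offs 10/50/150 come from the thread;
# the boost values are hypothetical and would need tuning.
TIERS = [(10, 3.0), (50, 2.0), (150, 1.0)]


def tiered_span_first(field, term, tiers=TIERS):
    """Build one optional SpanFirst clause per tier.

    Each SpanFirst matches only when the term's span ends at or before
    `end`, so an early match satisfies several clauses and collects
    several boosts, which gives a stepwise decaying boost.
    """
    clauses = []
    for end, boost in tiers:
        clauses.append(
            '  <Clause occurs="should">\n'
            f'    <SpanFirst end="{end}" boost="{boost}">\n'
            f'      <SpanTerm fieldName={quoteattr(field)}>{escape(term)}</SpanTerm>\n'
            '    </SpanFirst>\n'
            '  </Clause>'
        )
    return "<BooleanQuery>\n" + "\n".join(clauses) + "\n</BooleanQuery>"


def matching_tiers(position, tiers=TIERS):
    """Return the tier cut-offs that a single-term match at a 0-based
    token position falls into (a term at position p has span end p + 1)."""
    return [end for end, _ in tiers if position + 1 <= end]


if __name__ == "__main__":
    print(tiered_span_first("body", "shoes"))
```

The resulting string could then be sent as the body of an
{!xmlparser} query or boost query. Note that the final score still
depends on the similarity in use, so the boosts act as relative weights
rather than exact score contributions, and this remains a stepwise
approximation rather than scoring on the actual term position.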
