Re: Boost matches occurring early in the field (offset)

Ere Maijala Mon, 17 Sep 2018 19:36:15 -0700

The original question is interesting, and I'd also like to boost terms with lower positions. But if that is possible with the payload stuff, the slides and the article at https://lucidworks.com/2017/09/14/solr-payloads/ left me completely confused. A simple, complete example would be great.


Regards,
Ere


Alexandre Rafalovitch wrote on 29 Aug 2018 at 23:51:

TokenOffsetPayloadTokenFilter? It is mentioned in
https://www.slideshare.net/lucidworks/payloads-in-solr-erik-hatcher-lucidworks
but no detailed example seems to be given.

I do see this question from time to time, so a definitive answer
would be useful for the future.

Regards,
    Alex.

On 29 August 2018 at 16:18, Jan Høydahl <jan....@cominvent.com> wrote:

I also tend to use "sentinel tokens" for exact match or to anchor a search. But
to obtain a boost that decays the further down in the article a match is, you'd need
to write several such span/slop queries with varying slops, e.g. highest boost for the first
10 words, medium boost for the first 50 words, low boost for the first 150 words, no boost below
that.
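
A minimal sketch of the tiered workaround described above, written in Python and only generating the query text. It assumes Solr's XML query parser accepts SpanFirst with "end" and "boost" attributes and SpanTerm with "fieldName", as in Lucene's XML query parser; please verify this against your version. The field name "body", the helper name and the boost values 3.0/2.0/1.0 are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tiers modelled on the message above: (end position, boost).
TIERS = [(10, 3.0), (50, 2.0), (150, 1.0)]


def tiered_span_first_query(field, term, tiers=TIERS):
    """Build an XML query with one boosted SpanFirst clause per tier.

    Element and attribute names are assumed to match Lucene's XML
    query parser (BooleanQuery, Clause, SpanFirst, SpanTerm).
    """
    root = ET.Element("BooleanQuery")
    for end, boost in tiers:
        # Each tier is an optional clause, so the boosts add up.
        clause = ET.SubElement(root, "Clause", occurs="should")
        span_first = ET.SubElement(
            clause, "SpanFirst", end=str(end), boost=str(boost)
        )
        span_term = ET.SubElement(span_first, "SpanTerm", fieldName=field)
        span_term.text = term
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(tiered_span_first_query("body", "shoes"))
```

Because the clauses are optional and cumulative, a match within the first 10 positions satisfies all three tiers and collects all three boosts. On its own the query only matches documents with the term in the first 150 positions, so it would presumably be used as a boost query (for example via {!xmlparser}) next to the main query.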

As I wrote in my initial mail, we can do such workarounds, or play with 
payloads etc. But my real question is whether/how it is possible to factor the 
actual term offset information from a matching term into the scoring algorithm? 
Would you need to implement your own Scorer/Weight impl?

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 29 Aug 2018 at 15:37, Doug Turnbull
<dturnb...@opensourceconnections.com> wrote:

You can also insert a token at the beginning of the query during analysis
using a char filter. I call these sorts of boundary tokens "sentinel
tokens". So a phrase search for "red shoes" becomes "<SENT_BEG> red shoes".
You can add some slop to allow for permissible distance (with

You can also use the Limit Token Count Token Filter and create a copyField,
so if you want to boost matches within the first 10 tokens, just limit the copy
to 10 tokens and then use it in a boost query:
https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-LimitTokenCountFilter

-Doug

On Wed, Aug 29, 2018 at 6:26 AM Mikhail Khludnev <m...@apache.org> wrote:

<SpanFirst>
https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-XMLQueryParser


On Wed, Aug 29, 2018 at 1:19 PM Jan Høydahl <jan....@cominvent.com> wrote:

Hi,

Is there an ootb way to boost term matches based on their position/offset
inside a field, so that the term gets a higher score if it occurs in the
beginning of the field and a lower boost or a deboost if it occurs towards
the end of a field?

I know that I could index the first part of the text in a new field and
boost on that, but that is kind of "binary".
I could also add the term offset as payload for every term and boost on
that, but this should not be necessary since offset info is already part of
the index?
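
For reference, the payload workaround mentioned here could be prepared at index time roughly as follows. This is only a sketch: it assumes the text is sent to a delimited-payload field (one using DelimitedPayloadTokenFilter with a float encoder and "|" as the delimiter), and the decay formula, the scale value and the helper name are illustrative assumptions, not something given in this thread.

```python
def with_position_payloads(text, scale=10.0, delimiter="|"):
    """Append a position-dependent weight to each whitespace token.

    The weight 1 / (1 + position / scale) is an illustrative choice:
    it is 1.0 for the first token and falls to 0.5 at position `scale`.
    Tokens that already contain the delimiter are not handled here.
    """
    weighted = []
    for position, token in enumerate(text.split()):
        weight = 1.0 / (1.0 + position / scale)
        weighted.append(f"{token}{delimiter}{weight:.4f}")
    return " ".join(weighted)


if __name__ == "__main__":
    print(with_position_payloads("red shoes for sale"))
```

The stored weights could then be read at query time with the payload support described in the Lucidworks article linked in this thread (the payload() function or the payload_score parser), so that earlier matches contribute more to the score.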

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com


--
Sincerely yours
Mikhail Khludnev

--
CTO, OpenSource Connections
Author, Relevant Search
http://o19s.com/doug


--
Ere Maijala
Kansalliskirjasto / The National Library of Finland
