TokenOffsetPayloadTokenFilter? It is mentioned in https://www.slideshare.net/lucidworks/payloads-in-solr-erik-hatcher-lucidworks , but no detailed example seems to be given.
I do see this question from time to time, so a definitive answer would be useful for the future.

Regards,
   Alex.

On 29 August 2018 at 16:18, Jan Høydahl <jan....@cominvent.com> wrote:
> I also tend to use "sentinel tokens" for exact match or to anchor a search.
> But in order to obtain a decaying boost the further down in the article a match
> is, you'd need to write several such span/slop queries with varying slops,
> e.g. highest boost for first 10 words, medium boost for first 50 words, low
> boost for first 150 words, no boost below that.
>
> As I wrote in my initial mail, we can do such workarounds, or play with
> payloads etc. But my real question is whether/how it is possible to factor
> the actual term offset information from a matching term into the scoring
> algorithm? Would you need to implement your own Scorer/Weight impl?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
>> On 29 Aug 2018 at 15:37, Doug Turnbull
>> <dturnb...@opensourceconnections.com> wrote:
>>
>> You can also insert a token at the beginning of the query during analysis
>> using a char filter. I call these sorts of boundary tokens "sentinel
>> tokens". So a phrase search for "red shoes" becomes "<SENT_BEG> red shoes".
>> You can add some slop to allow for permissible distance (with
>>
>> You can also use the Limit Token Count Token Filter and create a copyField,
>> so if you want to boost on first 10 matches, just limit to 10 tokens then
>> use this as a boost query
>> https://lucene.apache.org/solr/guide/6_6/filter-descriptions.html#FilterDescriptions-LimitTokenCountFilter
>>
>> -Doug
>>
>> On Wed, Aug 29, 2018 at 6:26 AM Mikhail Khludnev <m...@apache.org> wrote:
>>
>>> <SpanFirst>
>>> <https://lucene.apache.org/solr/guide/6_6/other-parsers.html#OtherParsers-XMLQueryParser>
>>>
>>> On Wed, Aug 29, 2018 at 1:19 PM Jan Høydahl <jan....@cominvent.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there an ootb way to boost term matches based on their position/offset
>>>> inside a field, so that the term gets a higher score if it occurs in the
>>>> beginning of the field and lower boost or a deboost if it occurs towards
>>>> the end of a field?
>>>>
>>>> I know that I could index the first part of the text in a new field and
>>>> boost on that, but that is kind of "binary".
>>>> I could also add the term offset as payload for every term and boost on
>>>> that, but this should not be necessary since offset info is already part
>>>> of the index?
>>>>
>>>> --
>>>> Jan Høydahl, search solution architect
>>>> Cominvent AS - www.cominvent.com
>>>>
>>>
>>> --
>>> Sincerely yours
>>> Mikhail Khludnev
>>>
>> --
>> CTO, OpenSource Connections
>> Author, Relevant Search
>> http://o19s.com/doug
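
For illustration only, the tiered workaround Jan describes in this thread (several sloppy phrase queries anchored on a sentinel token, each with its own boost) could be generated along these lines. This is a minimal Python sketch that only builds a query string; it does not talk to Solr. It assumes a sentinel token has been indexed at the start of the field. The token name SENT_BEG (written without angle brackets), the field name "body", the helper name, and the slop/boost values are all hypothetical. The only syntax it relies on is the standard Lucene query parser notation for phrase slop (~N) and boosts (^N).

```python
def tiered_position_query(field, term, tiers, sentinel="SENT_BEG"):
    """Build one boosted sloppy-phrase clause per tier and OR them together.

    tiers is a list of (slop, boost) pairs. The slop roughly bounds how far
    the term may sit from the sentinel token at the start of the field.
    All names and values here are illustrative assumptions.
    """
    clauses = [
        f'{field}:"{sentinel} {term}"~{slop}^{boost}'
        for slop, boost in tiers
    ]
    return " OR ".join(clauses)


# Hypothetical tiers: strongest boost near the start, weaker further down.
TIERS = [(10, 8), (50, 4), (150, 2)]

query = tiered_position_query("body", "shoes", TIERS)
print(query)
```

Because the clauses are ORed, a match close to the start of the field satisfies every tier and would typically collect all of the boosts, while a match beyond the last tier gets none. This gives a stepwise approximation of a decaying boost, not scoring driven by the actual term offset, which is the open question in the thread.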