Hi Spyros,

Thanks for sharing! This is certainly a subject for a test, but I think the
LTR plugin could be modified to rerank the documents on the merging node.
For instance, if instead of a SolrCloud endpoint you use a separate Solr
instance to route and aggregate the federated results, the reranking could
happen only once, inside that instance.
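
To make the idea concrete, here is a minimal sketch in plain Python, outside
Solr entirely. It assumes the aggregating instance receives per-shard hits
that carry the first-pass sort value, merges them by that value alone, and
only then reranks the top N once. The function name merge_then_rerank, the
field name sort_score, and the split of documents into two shards are
hypothetical; the precomputed "score" field stands in for a real LTR model
call.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def merge_then_rerank(
    shard_results: List[List[Doc]],
    sort_value: Callable[[Doc], float],
    rerank_score: Callable[[Doc], float],
    top_n: int,
) -> List[Doc]:
    """Merge per-shard hits by the first-pass sort value, then rerank the top_n once."""
    # Step 1: flatten the shard responses and order them by the sort formula only.
    merged = sorted(
        (doc for shard in shard_results for doc in shard),
        key=sort_value,
        reverse=True,
    )
    # Step 2: rerank just the head of the merged list; the tail keeps its sort order.
    head = sorted(merged[:top_n], key=rerank_score, reverse=True)
    return head + merged[top_n:]


# Documents from the example in this thread; the split into two shards is made up.
shard_a = [
    {"id": "152", "sort_score": 17.38543, "score": 0.22140852},
    {"id": "1523", "sort_score": 14.4093275, "score": 0.26738763},
    {"id": "6704", "sort_score": 13.956842, "score": 0.17357588},
]
shard_b = [
    {"id": "2016", "sort_score": 14.612957, "score": 0.19214153},
    {"id": "6512", "sort_score": 14.43907, "score": 0.11575622},
]

ranked = merge_then_rerank(
    [shard_a, shard_b],
    sort_value=lambda d: d["sort_score"],
    rerank_score=lambda d: d["score"],  # stand-in for a real LTR model call
    top_n=3,
)
print([d["id"] for d in ranked])
```

Because reranking runs once, after the merge, no later sort step can reshuffle
the reranked head of the list.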

Another approach, based on score normalization, is described here:
https://sease.io/2016/10/apache-solr-learning-to-rank-better-part-4.html
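
One common form of score normalization is min-max scaling, which maps each
shard's scores into [0, 1] so they become comparable before merging. The
sketch below assumes that form, which may differ from what the linked post
describes; the function name min_max_normalize and the sample scores are
hypothetical.

```python
from typing import List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Scale scores linearly into [0, 1]; a constant or empty list maps to zeros."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so there is no spread to scale.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Made-up raw scores from two shards with very different score ranges.
shard_a_scores = [2.0, 4.0, 6.0]
shard_b_scores = [10.0, 30.0]

print(min_max_normalize(shard_a_scores))
print(min_max_normalize(shard_b_scores))
```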

On Fri, Aug 28, 2020 at 7:39 PM Spyros Kapnissis <ska...@gmail.com> wrote:

> Hi Dmitry,
>
> No, we were not able to solve the sorting/re-ranking issue. In the end we
> moved the custom sorting formula into the 'q' param instead of 'sort', so
> that the results come back sorted by score as expected.
>
> That mostly solved our issues with inconsistent Solr scores. Maybe sorting
> and re-ranking are conflicting concepts.
>
> Hope this helps.
>
>
> On Fri, Aug 28, 2020 at 4:28 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
> > Maybe this can help you?
> >
> >
> > https://lucene.apache.org/solr/guide/7_5/distributed-requests.html#configuring-statscache-distributed-idf
> >
> > On Mon, May 11, 2020 at 9:24 AM Spyros Kapnissis <ska...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > On our current master/slave setup (no cloud), we use a custom sorting
> > > function to get the first-pass results (using the sort param), and then
> > > we use LTR for re-ranking. This works fine, i.e. re-ranking is applied
> > > to the topN after sorting has completed, and the order is correct.
> > >
> > > However, as we are migrating to SolrCloud (version 7.3.1) with multiple
> > > shards, this does not seem to work as expected. To my understanding,
> > > Solr collects the reranked results from the shards back on a single
> > > node to merge them, and then tries to re-apply sorting.
> > >
> > > We would expect the results to at least follow the sorting formula,
> > > even if this is not what we want. But this is still not even the case,
> > > as the combination of the two (sorting + reranking) results in erratic
> > > ordering.
> > >
> > > Example result, where $sort_score is the sorting formula output, and
> > > score is the LTR re-ranked output:
> > >
> > > {"id": "152",
> > > "$sort_score": 17.38543,
> > > "score": 0.22140852
> > > },
> > > {"id": "2016",
> > > "$sort_score": 14.612957,
> > > "score": 0.19214153
> > > },
> > > { "id": "1523",
> > > "$sort_score": 14.4093275,
> > > "score": 0.26738763
> > > },
> > > { "id": "6704",
> > > "$sort_score": 13.956842,
> > > "score": 0.17357588
> > > },
> > > { "id": "6512",
> > > "$sort_score": 14.43907,
> > > "score": 0.11575622
> > > },
> > >
> > > We also tried with other simple re-rank queries apart from LTR, and the
> > > issue persisted.
> > >
> > > Could someone please help troubleshoot? Ideally, we would want to have
> > > the re-rank results merged on the single node, and not re-apply sorting.
> > >
> > > Thank you!
> > >
> >
>


-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: https://semanticanalyzer.info
