Hi Spyros,

Thanks for sharing! This certainly needs to be tested, but I think the LTR plugin could be modified to rerank the documents on the merging node. For instance, if instead of the SolrCloud endpoint you use a separate Solr instance to route and aggregate the federated results, the reranking could happen only once, inside that instance.
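To make the "rerank only once on the merging node" idea concrete, here is a minimal Python sketch. It is not the LTR plugin's code and does not call Solr: `merge_then_rerank`, the `sort_score` and `ltr_score` field names, and the two-shard split are all hypothetical, and the shard responses are plain lists of dicts. The numbers are the ones from the example in the quoted message.

```python
from itertools import chain
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]  # hypothetical document shape, not a Solr response


def merge_then_rerank(
    shard_results: List[List[Doc]],
    rerank_fn: Callable[[Doc], float],
    top_n: int,
) -> List[Doc]:
    """Merge shard results by first-pass score, then rerank the global top N once."""
    # First pass: merge all shard lists and order by the custom sort score.
    merged = sorted(
        chain.from_iterable(shard_results),
        key=lambda d: d["sort_score"],
        reverse=True,
    )
    # Second pass: rerank only the global top N, on the merging node.
    head = sorted(merged[:top_n], key=rerank_fn, reverse=True)
    # Documents below the top N keep their first-pass order.
    return head + merged[top_n:]


# Hypothetical split of the example documents across two shards.
shards = [
    [
        {"id": "152", "sort_score": 17.38543, "ltr_score": 0.22140852},
        {"id": "1523", "sort_score": 14.4093275, "ltr_score": 0.26738763},
        {"id": "6512", "sort_score": 14.43907, "ltr_score": 0.11575622},
    ],
    [
        {"id": "2016", "sort_score": 14.612957, "ltr_score": 0.19214153},
        {"id": "6704", "sort_score": 13.956842, "ltr_score": 0.17357588},
    ],
]

# Here the rerank score is simply read from the document; in a real setup it
# would come from the LTR model, evaluated once on the merged list.
result = merge_then_rerank(shards, lambda d: d["ltr_score"], top_n=3)
print([d["id"] for d in result])
```

Because the sort and the rerank each run exactly once over the merged list, the top N are ordered by the rerank score and everything below follows the sorting formula, which is the ordering one would expect from the single-node setup.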
Another approach with score normalization is mentioned here:
https://sease.io/2016/10/apache-solr-learning-to-rank-better-part-4.html

On Fri, Aug 28, 2020 at 7:39 PM Spyros Kapnissis <ska...@gmail.com> wrote:

> Hi Dmitry,
>
> No, we were not able to solve the sorting/re-ranking issue. In the end we
> migrated the custom sorting formula to using the 'q' param instead of
> 'sort' to get back the results sorted by score as expected.
>
> That mostly solved our issues with inconsistent Solr scores. Maybe sorting
> and re-ranking are conflicting concepts.
>
> Hope this helps.
>
> On Fri, Aug 28, 2020 at 4:28 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
> > Maybe this can help you?
> >
> > https://lucene.apache.org/solr/guide/7_5/distributed-requests.html#configuring-statscache-distributed-idf
> >
> > On Mon, May 11, 2020 at 9:24 AM Spyros Kapnissis <ska...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > On our current master/slave setup (no cloud), we use a custom sorting
> > > function to get the first pass results (using the sort param), and then
> > > we use LTR for re-ranking. This works fine, i.e. re-ranking is applied
> > > on the topN, after sorting has completed and the order is correct.
> > >
> > > However, as we are migrating to SolrCloud (version 7.3.1) with multiple
> > > shards, this does not seem to work as expected. To my understanding,
> > > Solr collects the reranked results from the shards back on a single
> > > node to merge them, and then tries to re-apply sorting.
> > >
> > > We would expect the results to at least follow the sorting formula,
> > > even if this is not what we want. But this is still not even the case,
> > > as the combination of the two (sorting + reranking) results in erratic
> > > ordering.
> > >
> > > Example result, where $sort_score is the sorting formula output, and
> > > score is the LTR re-ranked output:
> > >
> > > {"id": "152",
> > >  "$sort_score": 17.38543,
> > >  "score": 0.22140852
> > > },
> > > {"id": "2016",
> > >  "$sort_score": 14.612957,
> > >  "score": 0.19214153
> > > },
> > > {"id": "1523",
> > >  "$sort_score": 14.4093275,
> > >  "score": 0.26738763
> > > },
> > > {"id": "6704",
> > >  "$sort_score": 13.956842,
> > >  "score": 0.17357588
> > > },
> > > {"id": "6512",
> > >  "$sort_score": 14.43907,
> > >  "score": 0.11575622
> > > },
> > >
> > > We also tried with other simple re-rank queries apart from LTR, and the
> > > issue persisted.
> > >
> > > Could someone please help troubleshoot? Ideally, we would want to have
> > > the re-rank results merged on the single node, and not re-apply sorting.
> > >
> > > Thank you!

-- 
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: https://semanticanalyzer.info