Hi. We're using LTR, and after switching to multiple shards we found that reranking happens on each shard individually and that the first-pass score isn't used during the merge phase. Currently our LTR model doesn't use textual match features and assumes that the documents it reranks are already reasonably good in terms of textual score, which is not always the case when documents are distributed across shards.
To avoid this I tried sorting by a function that replicates the actual query, and the results I get are somewhat interesting. On individual shards the first pass is ordered by my sort, then the documents are reranked, and during the merge documents from the same shard are compared by "orderInShard" while documents from different shards are compared by the sort value, so the final order follows neither the sort value nor the score.

For example, let's assume that the documents coming from shard 1 are:

  doc1(first_pass_score = 1, second_pass_score = 2)
  doc2(first_pass_score = 4, second_pass_score = 1)

and the documents coming from shard 2 are:

  doc4(first_pass_score = 3, second_pass_score = 4)
  doc3(first_pass_score = 2, second_pass_score = 3)

where first_pass_score is doc.sort_values[0] and second_pass_score is doc.score.

When we try to merge all documents, this happens:

  queue.insertWithOverflow(doc1)
  queue.insertWithOverflow(doc2)
  queue.lessThan(doc1, doc2) -> false (doc1.orderInShard = 1 < doc2.orderInShard = 2)
  queue.insertWithOverflow(doc4)
  queue.lessThan(doc2, doc4) -> false (doc2.first_pass_score = 4 > doc4.first_pass_score = 3)
  queue.insertWithOverflow(doc3)
  queue.lessThan(doc4, doc3) -> false (doc4.orderInShard = 1 < doc3.orderInShard = 2)

and the final result will be:

  doc1(first_pass_score = 1, second_pass_score = 2)
  doc2(first_pass_score = 4, second_pass_score = 1)
  doc4(first_pass_score = 3, second_pass_score = 4)
  doc3(first_pass_score = 2, second_pass_score = 3)

Ideally I would want the rerank to happen based on the global order across all shards. I've implemented a custom component that asks each shard to return *Math.max(reRankDocs, offset + rows)* documents, which are first sorted by first-pass score, and then only the top *reRankDocs* are sorted by second-pass score. I understand that it might not be the best way in terms of performance (we rerank only the top 60 documents, so it's not that big of a deal), but it's functionally equivalent to the single-shard behavior.
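To make the intended ordering concrete, here is a minimal sketch of the merge step described above, in plain Java with no Solr classes. It is not the actual component: the names Hit, perShardLimit and globalRerank are hypothetical, it assumes every shard response already carries both the first-pass sort value and the LTR score, and it assumes higher is better for both.

```java
import java.util.ArrayList;
import java.util.List;

public class GlobalRerankSketch {

    /** Hypothetical container for one hit returned by a shard. */
    static final class Hit {
        final String id;
        final double firstPass;   // corresponds to doc.sort_values[0]
        final double secondPass;  // corresponds to doc.score from the LTR rescorer

        Hit(String id, double firstPass, double secondPass) {
            this.id = id;
            this.firstPass = firstPass;
            this.secondPass = secondPass;
        }
    }

    /** Number of documents each shard is asked to return. */
    static int perShardLimit(int reRankDocs, int offset, int rows) {
        return Math.max(reRankDocs, offset + rows);
    }

    /**
     * Merges the shard responses, orders them globally by first-pass score,
     * reorders only the top reRankDocs by second-pass score, then pages.
     */
    static List<Hit> globalRerank(List<List<Hit>> shardResponses,
                                  int reRankDocs, int offset, int rows) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> shard : shardResponses) {
            merged.addAll(shard);
        }

        // Global first pass: descending by first-pass score.
        merged.sort((a, b) -> Double.compare(b.firstPass, a.firstPass));

        // Second pass: only the global top reRankDocs are reordered, in place.
        int window = Math.min(reRankDocs, merged.size());
        merged.subList(0, window)
              .sort((a, b) -> Double.compare(b.secondPass, a.secondPass));

        // Apply paging after both passes.
        int from = Math.min(offset, merged.size());
        int to = Math.min(offset + rows, merged.size());
        return new ArrayList<>(merged.subList(from, to));
    }

    public static void main(String[] args) {
        // The four documents from the example above.
        List<Hit> shard1 = List.of(new Hit("doc1", 1, 2), new Hit("doc2", 4, 1));
        List<Hit> shard2 = List.of(new Hit("doc4", 3, 4), new Hit("doc3", 2, 3));

        StringBuilder order = new StringBuilder();
        for (Hit h : globalRerank(List.of(shard1, shard2), 2, 0, 4)) {
            order.append(h.id).append(' ');
        }
        System.out.println(order.toString().trim()); // prints "doc4 doc2 doc3 doc1"
    }
}
```

With reRankDocs = 2, only doc2 and doc4 (the global top two by first-pass score) are reranked, and doc3 and doc1 keep their first-pass order behind them. This matches what a single shard holding all four documents would produce under the same assumptions.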
I'm curious whether the current behavior is intended. I would expect either what I described above or, at least, that the merge ignores the sort and uses only the doc.score generated by the LTR rescorer. Would the community be interested in the approach I've implemented? Or is it considered bad design to rely on the first-pass score, and should our LTR model instead use fields from the first pass / use OriginalScoreFeature?