First, thanks for taking the time to ask a question with enough supporting details that I can hope to be able to answer in one exchange ;). It’s a pleasure to see.
Second, NP with asking on Stack Overflow, they have some excellent answers there. But you’re right, this list gets more Solr-centered eyeballs.

On to your question. I think the best answer was that “/export wasn’t designed to deal with scores”, which you’ll find disappointing. You could use the Streaming “search” expression (using qt=/select or just leave qt out) but that’ll sort all of the docs you’re exporting into a huge list, which may perform worse than CursorMark even if it doesn’t blow up memory.

The root of this problem is that export can sort in batches since the values it’s sorting on are contained in each document, so it can iterate in batches, send them out, then iterate again on the remaining documents. Score, since it’s dynamic, can’t do that. Solr has to score _all_ the docs to know where a doc lands in the final set relative to any other doc, so if it were going to work it’d have to have enough memory to hold the scores of all the docs in an ordered list, which is very expensive. Conceptually this is an ordered list up to maxDoc long. Not only does there have to be enough memory to hold the entire list, every doc has to be inserted individually which can kill performance. This is the “deep paging” problem.

In the usual case of returning, say, 20 docs, the sorted list only has to be 20 long, higher scoring docs evict lower scoring docs.

So I think CursorMark is your best bet.

Best,
Erick

> On Oct 1, 2019, at 3:59 AM, Edward Turner <eddtur...@gmail.com> wrote:
>
> Hi all,
>
> As far as I understand, SolrCloud currently does not allow the use of
> sorting by the pseudofield, score in the /export request handler (i.e., get
> the results in relevancy order). If we do attempt this, we get an
> exception, "org.apache.solr.search.SyntaxError: Scoring is not currently
> supported with xsort". We could use Solr's cursorMark, but this takes a
> very long time ...
>
> Exporting results does work, however, when exporting result sets by a
> specific document field that has docValues set to true.
>
> Question:
> Does anyone know if/when it will be possible to sort by score in the
> /export handler?
>
> Research on the problem:
> We've seen https://issues.apache.org/jira/browse/SOLR-5244 and
> https://issues.apache.org/jira/browse/SOLR-8664, which are related to this
> issue, but don't fix it. Maybe I've missed a more relevant issue?
>
> Our use-case
> We are using Solrcloud in our team and it's added a huge
> amount of value to our users.
>
> We show a table of search results ordered by score (relevancy) that was
> obtained from sending a query to the standard /select handler. We're
> working in the life-sciences domain and it is common for our result sets to
> contain many millions of results (unfortunately). After users browse their
> results, they then may want to download the results that they see, to do
> some post-processing. However, to do this, such that the results appear in
> the order that the user originally saw them, we'd need to be able to export
> results based on score/relevancy.
>
> Any suggestions or advice on this would be greatly appreciated!
>
> Many thanks!
>
> Edd
>
> PS. apologies for posting also on Stackoverflow (
> https://stackoverflow.com/questions/58167152/solrcloud-export-all-results-sorted-by-score)
> --
> I only discovered the Solr mailing-list afterwards and thought it probably
> better to reach out directly to Solr's people (I can share any answer from
> this forum on there retrospectively).
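
A rough illustration of the point in the reply about list sizes, in case it helps: the sketch below is plain Python, not Solr code, and the function names and the (doc_id, score) pairs are made up for the example. It contrasts the usual top-N case, where a small bounded list is enough because higher scoring docs evict lower scoring ones, with a full score-ordered export, which has to hold an entry for every matching doc before the first one can be sent.

```python
import heapq


def top_n_by_score(scored_docs, n):
    """Return the ids of the n highest-scoring docs, best first.

    scored_docs is any iterable of (doc_id, score) pairs (hypothetical data).
    Only n entries are ever held in memory, however many docs are scored.
    """
    heap = []  # min-heap of (score, doc_id); heap[0] is the weakest doc kept
    for doc_id, score in scored_docs:
        if len(heap) < n:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # A higher scoring doc evicts the current lowest scoring one.
            heapq.heapreplace(heap, (score, doc_id))
    return [doc_id for _, doc_id in sorted(heap, reverse=True)]


def all_by_score(scored_docs):
    """Return every doc id ordered by score, best first.

    This needs one list entry per doc (conceptually up to maxDoc entries),
    which is the cost a score-sorted export would have to pay.
    """
    ordered = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in ordered]


if __name__ == "__main__":
    docs = [("a", 0.2), ("b", 0.9), ("c", 0.5), ("d", 0.7)]
    print(top_n_by_score(docs, 2))
    print(all_by_score(docs))
```

The first function's memory use depends only on n; the second grows with the size of the result set, which is why many millions of results sorted by score are a problem for a batch-oriented export.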