Hi Chris,

Good info, thank you for that!
> What's your UI & middle layer like for this application and
> eventual "download" ?

I'm working in a team on the back-end side of things, where we provide a
REST API that can be used by clients, one of which is our UI, a React JS
based app with various fancy bio visualisations in it.

In slightly more detail: Solr is used purely for search, giving us the IDs
of the hits. We then use a key-value store to fetch the entity data for
those IDs. So, generally speaking, each "download" involves:

1. user request asking for data in content-type X
2. our REST app makes a Solr request
3. IDs <- Solr fetches results
4. entities <- fetch entities with keys in IDs from the key-value store
5. write entities in format X

Using cursorMark, 3 & 4 will be performed repeatedly until all hits are
fetched; and we may run 3 in a separate thread from 4 & 5, so that Solr
communication need not block fetching entity data / writing. (A rough
sketch of the cursor loop is at the bottom of this mail, below the quoted
text.) We could do more optimisation around these tasks, but I'm sure
you've already understood.

Many thanks for your input.

Best,
Edd

On Thu, 3 Oct 2019 at 19:13, Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> : We show a table of search results ordered by score (relevancy) that was
> : obtained from sending a query to the standard /select handler. We're
> : working in the life-sciences domain and it is common for our result sets to
> : contain many millions of results (unfortunately). After users browse their
> : results, they then may want to download the results that they see, to do
> : some post-processing. However, to do this, such that the results appear in
> : the order that the user originally saw them, we'd need to be able to export
> : results based on score/relevancy.
>
> What's your UI & middle layer like for this application and
> eventual "download" ?
>
> I'm going to presume your end user facing app is reading the data from
> Solr, buffering it locally while formatting it in some user selected
> export format, and then giving the user a download link?
>
> In which case using a cursor, and making iterative requests to solr from
> your app should work just fine...
>
> https://lucene.apache.org/solr/guide/8_0/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors
>
> (The added benefit of cursors over /export is that it doesn't require doc
> values on every field you return ... which seems like something that you
> might care about if you have large (text) fields and an index growing as
> fast as you describe yours growing)
>
> If you don't have any sort of middle layer application, and you're just
> providing a very thin (ie: javascript) based UI in front of solr,
> and need a way to stream a full result set from solr that you can give
> your end users raw direct access to ... then i think you're out of luck?
>
> -Hoss
> http://www.lucidworks.com/
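
P.S. In case it helps anyone else following the thread, below is a rough
sketch of what the cursor loop for steps 3-5 could look like using SolrJ.
It is only illustrative: the collection URL, the "id" field, the query and
the page size are placeholders rather than our real set-up, and the
key-value store fetch / writing are left as comments.

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    public class CursorExport {
        public static void main(String[] args) throws Exception {
            // placeholder collection URL
            HttpSolrClient solr = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/entities").build();

            SolrQuery q = new SolrQuery("*:*");  // placeholder query
            q.setRows(1000);                     // page size per cursor request
            // cursorMark requires a sort ending in the uniqueKey field as a
            // tie-breaker; sorting on score first keeps the ordering the
            // user saw in the UI
            q.setSort(SolrQuery.SortClause.desc("score"));
            q.addSort(SolrQuery.SortClause.asc("id"));

            String cursorMark = CursorMarkParams.CURSOR_MARK_START;
            boolean done = false;
            while (!done) {
                q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
                QueryResponse rsp = solr.query(q);  // step 3: one page of hits
                for (SolrDocument doc : rsp.getResults()) {
                    String id = (String) doc.getFieldValue("id");
                    // step 4: fetch the entity for this id from the key-value store
                    // step 5: write the entity out in the requested content-type
                }
                String next = rsp.getNextCursorMark();
                // the cursor is exhausted when Solr returns the same mark again
                done = cursorMark.equals(next);
                cursorMark = next;
            }
            solr.close();
        }
    }

The key points (as in the doc Hoss linked) are that the sort must include
the uniqueKey field as a tie-breaker, and that you keep looping until the
returned cursorMark stops changing. Running the Solr loop in a separate
thread from the key-value fetching/writing could then be done by pushing
the IDs onto a queue instead of processing them inline.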