Hi Chris,

Good info, thank you for that!

> What's your UI & middle layer like for this application and
> eventual "download" ?

I'm working in a team on the back-end side of things, where we provide a
REST API for clients to use. One such client is our UI, a React JS based
app with various fancy bio visualisations in it. In slightly more detail,
Solr is used purely for search, giving us the IDs of the hits; we then use
a key-value store to fetch the entity data for those IDs. So, generally
speaking, each "download" involves:

1. user request asking for data in content-type X
2. our REST app makes solr request
3. IDs <- results of the Solr query
4. entities <- fetch from the key-value store the entities whose keys are in IDs
5. write entities in format X
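
The cursor loop in step 3 can be sketched as follows (a minimal sketch; `fetch_page` is a hypothetical stand-in for the real Solr request, which must follow the cursorMark contract: a sort that includes the uniqueKey as tiebreaker, `cursorMark=*` on the first request, and looping until `nextCursorMark` stops changing):

```python
def iter_hit_ids(fetch_page, page_size=500):
    """Yield hit IDs page by page until the cursor stops advancing.

    fetch_page(cursor, rows) performs one Solr /select request with the
    given cursorMark and rows, returning the parsed JSON response.
    """
    cursor = "*"  # Solr's CURSOR_MARK_START
    while True:
        response = fetch_page(cursor, page_size)
        for doc in response["response"]["docs"]:
            yield doc["id"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # cursor did not advance: all hits fetched
            return
        cursor = next_cursor
```

In our case `fetch_page` would issue the request with something like `sort=score desc, id asc` so that the relevancy ordering the user saw is preserved while still satisfying the uniqueKey-tiebreaker requirement.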

Using cursorMark, steps 3 & 4 will be performed repeatedly until all hits
have been fetched; and we may run step 3 in a separate thread from steps 4
& 5, so that Solr communication need not block fetching entity data or
writing. We could do more optimisation around these tasks, but I'm sure
you get the idea.
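
That producer/consumer split could look something like this (a sketch only; `fetch_entities` and `write_batch` are illustrative names for our step 4 & 5 code, and the bounded queue provides back-pressure so the Solr thread can't run arbitrarily far ahead):

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the ID stream

def run_download(id_batches, fetch_entities, write_batch, max_buffered=4):
    """Run step 3 in a producer thread while steps 4 & 5 consume its output.

    id_batches:     iterable of ID lists produced by the cursor loop (step 3)
    fetch_entities: ID list -> entity list, from the key-value store (step 4)
    write_batch:    entity list -> None, serialising in format X (step 5)
    """
    buf = queue.Queue(maxsize=max_buffered)  # bounded: producer blocks when full

    def produce():
        for batch in id_batches:
            buf.put(batch)
        buf.put(_SENTINEL)

    producer = threading.Thread(target=produce)
    producer.start()
    while (batch := buf.get()) is not _SENTINEL:
        write_batch(fetch_entities(batch))
    producer.join()
```

The consumer here does steps 4 and 5 serially per batch; splitting those into a further stage is one of the extra optimisations alluded to above.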

Many thanks for your input.

Best,
Edd

On Thu, 3 Oct 2019 at 19:13, Chris Hostetter <hossman_luc...@fucit.org>
wrote:

>
> : We show a table of search results ordered by score (relevancy) that was
> : obtained from sending a query to the standard /select handler. We're
> : working in the life-sciences domain and it is common for our result sets
> to
> : contain many millions of results (unfortunately). After users browse
> their
> : results, they then may want to download the results that they see, to do
> : some post-processing. However, to do this, such that the results appear
> in
> : the order that the user originally saw them, we'd need to be able to
> export
> : results based on score/relevancy.
>
> What's your UI & middle layer like for this application and
> eventual "download" ?
>
> I'm going to presume your end user facing app is reading the data from
> Solr, buffering it locally while formatting it in some user selected
> export format, and then giving the user a download link?
>
> In which case using a cursor, and making iterative requests to solr from
> your app should work just fine...
>
>
> https://lucene.apache.org/solr/guide/8_0/pagination-of-results.html#fetching-a-large-number-of-sorted-results-cursors
>
> (The added benefit of cursors over /export is that it doesn't require doc
> values on every field you return ... which seems like something that you
> might care about if you have large (text) fields and an index growing as
> fast as you describe yours growing)
>
>
> If you don't have any sort of middle layer application, and you're just
> providing a very thin (ie: javascript) based UI in front of solr,
> and need a way to stream a full result set from solr that you can give
> your end users raw direct access to ... then i think you're out of luck?
>
>
> -Hoss
> http://www.lucidworks.com/
>
