I have to confess that I know very little about the mechanics of LTR, but
I can talk a little bit about compression.

When a stored value is retrieved for a document, it is read from the
*.fdt file, which holds a compressed, verbatim copy of the field. DocValues
can bypass this stored data and read directly from the DV format.
There's a discussion of useDocValuesAsStored in solr/CHANGES.txt.
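
As a rough sketch of what that looks like in the schema (the field name
"category" and the type name "string" are placeholders I made up, and I'm
assuming "string" maps to solr.StrField):

```xml
<!-- Hypothetical field: not stored, so nothing is written to the *.fdt
     file for it; returned values are read from docValues instead -->
<field name="category" type="string" indexed="true" stored="false"
       docValues="true" useDocValuesAsStored="true"/>
```

With useDocValuesAsStored="true" the field can still be returned in
results as if it were stored. Whether LTR picks its values up from there
is something I can't vouch for.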

The restriction on docValues is that they can only be used for
primitive types (numerics, strings and the like), specifically _not_
fields with class="solr.TextField".
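
For instance, assuming type names along the lines of the stock schemas
("string", "long", "text_general"; the field names are invented for
illustration):

```xml
<!-- Fine: primitive types can carry docValues -->
<field name="doc_type" type="string" indexed="true" stored="false"
       docValues="true"/>
<field name="citation_count" type="long" indexed="true" stored="false"
       docValues="true"/>

<!-- No docValues here: text_general is assumed to be declared with
     class="solr.TextField", so it has to stay a stored field -->
<field name="abstract" type="text_general" indexed="true" stored="true"/>
```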

WARNING: I have no real clue whether LTR is built to leverage
docValues fields. If you add docValues="true" to the relevant
fields you'll have to re-index completely. In fact I'd use a new
collection.

And don't be put off by the fact that the index size on disk will grow
if you add docValues. They are MMapped, so they live in OS memory rather
than on the JVM heap, and will actually _reduce_ your JVM requirements.

Best,
Erick



On Thu, Aug 10, 2017 at 6:57 AM, Sebastian Klemke
<sebastian.kle...@researchgate.net> wrote:
> Hi,
>
> we're currently experimenting with LTR reranking on large rerank
> windows (rerankDocs=1000+). On a >500M documents SolrCloud collection,
> we were only able to get sub-second response times with
> FieldValueFeature. Therefore we created a custom feature extractor that
> matches field values with constant strings to substitute simple
> SolrFeature usages. Apparently, the response time is now dominated by
> loading stored fields, more specifically by uncompressing chunks of
> stored field data.
>
> We're now wondering how many documents LTR can rerank in practice and
> what the bottlenecks are. Do you guys have any experience using it?
>
>
> Regards,
>
> Sebastian
>
>
> --
> Sebastian Klemke
> Senior Software Engineer
>
> ResearchGate GmbH
> Invalidenstr. 115, 10115 Berlin, Germany
>
> www.researchgate.net
>
> Registered Seat: Hannover, HR B 202837
> Managing Directors: Dr Ijad Madisch, Dr Sören Hofmayer VAT-ID: DE258434568
> A proud affiliate of: ResearchGate Corporation, 350 Townsend St #754, San 
> Francisco, CA 94107
>
