We fetch a large number of documents -- 1000+ -- for each search. Each
request fetches only the uniqueKey or the uniqueKey plus one secondary
integer key. Despite this, we find that we spend a sizable amount of time
in SolrIndexSearcher#doc(int docId, Set<String> fields). Time is spent
fetching the two stored fields, LZ4 decoding, etc.

I would love to be able to tell Solr to always fetch these two fields from
memory. We have them both in the fieldCache so we're already spending the
RAM. I've seen this asked previously [1], so it seems like a fairly common
need, especially for distributed search. Any ideas?

A few possible ideas I had:

--Check FieldCache#getCacheEntries() before going to stored fields.
--Give the documentCache config a list of fields it should load from the
fieldCache.
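
To make the first idea concrete, here is a rough sketch in plain Java with no
Solr dependency. The names InMemoryFieldLookup and StoredFieldLoader, and the
"one array per field, indexed by docId" layout, are hypothetical stand-ins
for the fieldCache and the stored-field reader, not actual Solr APIs. It only
shows the control flow: answer from memory when every requested field is
covered, otherwise fall back to stored fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not Solr code: serve a document from in-memory
// columns when all requested fields are covered, else use stored fields.
class InMemoryFieldLookup {
    // Stand-in for the expensive stored-field path (disk read, LZ4 decode).
    interface StoredFieldLoader {
        Map<String, Object> load(int docId, Set<String> fields);
    }

    // field name -> values indexed by docId (assumed layout)
    private final Map<String, Object[]> columns = new HashMap<>();
    private final StoredFieldLoader fallback;
    int fallbackCalls = 0; // counts how often the slow path was taken

    InMemoryFieldLookup(StoredFieldLoader fallback) {
        this.fallback = fallback;
    }

    void addColumn(String field, Object[] valuesByDocId) {
        columns.put(field, valuesByDocId);
    }

    // Assumption: a null field set means "all fields", so it always
    // goes to the stored-field path.
    Map<String, Object> doc(int docId, Set<String> fields) {
        if (fields != null && columns.keySet().containsAll(fields)) {
            Map<String, Object> result = new HashMap<>();
            for (String field : fields) {
                result.put(field, columns.get(field)[docId]);
            }
            return result;
        }
        fallbackCalls++;
        return fallback.load(docId, fields);
    }
}
```

In our case the two columns would be the uniqueKey and the secondary integer
key, so the common request would never touch stored fields.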


Having an in-memory mapping from docId->uniqueKey has come up for us
before. We've used a custom SolrCache maintaining that mapping to quickly
filter over personalized collections. Maybe the uniqueKey should be better
optimized out of the box? Perhaps a custom "uniqueKey" codec that also
maintains the docId->uniqueKey mapping in memory?
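
For illustration, the kind of mapping I mean looks roughly like the sketch
below. This is plain Java, not our actual SolrCache; the class name
UniqueKeyMap and the flat String[] layout are assumptions made for the
example. Since docIds are not stable across commits and merges, a real
version would have to be rebuilt whenever a new searcher is opened.

```java
import java.util.BitSet;
import java.util.Set;

// Hypothetical sketch: an in-memory docId -> uniqueKey table, plus a
// filter that marks the docIds belonging to a personalized collection.
class UniqueKeyMap {
    private final String[] keyByDocId; // index = docId, value = uniqueKey

    UniqueKeyMap(String[] keyByDocId) {
        this.keyByDocId = keyByDocId.clone();
    }

    // Constant-time lookup, no stored-field access.
    String uniqueKey(int docId) {
        return keyByDocId[docId];
    }

    // Returns a bit per docId, set when that doc's uniqueKey is in the
    // given collection.
    BitSet filter(Set<String> collection) {
        BitSet bits = new BitSet(keyByDocId.length);
        for (int docId = 0; docId < keyByDocId.length; docId++) {
            if (collection.contains(keyByDocId[docId])) {
                bits.set(docId);
            }
        }
        return bits;
    }
}
```

The cost is one String reference per document, which is RAM we are already
spending in the fieldCache.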

--Gregg

[1] http://search-lucene.com/m/oCUKJ1heHUU1
