On 8/18/2015 7:21 AM, Norgorn wrote:
> SOLR version - 4.10.3
> We have SOLR Cloud cluster, each node has documents only for several
> categories.
> Queries look like "...fq=cat(1 3 89 ...)&..."
> So, only some nodes need to process, others can answer with zero as soon as
> they check "cat".
>
> The problem is to keep separate cache for "cat" values on each node.
>
> As I understand, custom caches are available only for custom request
> handlers, but we are happy with default SearchHandler.

I'm curious why you need to make any changes at all. Unless the number of unique values in the cat field is extremely large, a query for a nonexistent term will normally be extremely fast.

In the example you provided in a later message on this thread, you would save 200 milliseconds on the entire query, so the 1300 milliseconds of the next longest query would dominate your query time. Although the percentage is significant, this barely registers in human time perception.

Based on the numbers you provided, which are fairly similar for all nodes whether there are matches or not, I am thinking that this field does NOT have a huge number of unique values. I think that a QTime of over one second per node for a simple search on a category field indicates a major performance problem.

For comparison purposes, on one of my own indexes (not SolrCloud, but still distributed), I do a query of "ip:get", and I see a QTime of 552 milliseconds. Subsequent cached queries for the same information happen in about 3 milliseconds. This query matches over 100 million docs -- 104073614, which is nearly half of the 224214642 docs in the entire index. The whole index (split between seven shards on two machines) takes up over 250GB of disk space. It is not a small index. There are 35 unique values in the ip field.

I do not have enough memory on these servers for optimal performance ... if I could put 128GB or more of RAM on each server instead of the 64GB that's there now, my query time would likely be even faster.

Thanks,
Shawn
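
A rough sketch of the arithmetic behind the "next longest query would dominate" point, assuming a distributed request takes about as long as its slowest shard (merge overhead is ignored). The function name and the per-shard timings are hypothetical; only the 1300 millisecond figure and the 200 millisecond saving come from this thread.

```python
def distributed_qtime(shard_times_ms):
    """Approximate a distributed query's time as the slowest shard's time.

    Assumption: shards are queried in parallel and merge cost is negligible.
    """
    return max(shard_times_ms)


# Hypothetical per-shard times in milliseconds. The first shard holds no
# documents for the requested categories but still takes the longest.
before = [1500, 1300, 1250]

# Same query if the first shard could answer "zero matches" almost instantly.
after = [5, 1300, 1250]

saving = distributed_qtime(before) - distributed_qtime(after)
print(distributed_qtime(before), distributed_qtime(after), saving)
```

Under these assumed numbers the first shard's own time drops by nearly 1500 milliseconds, yet the whole request only gets 200 milliseconds faster, because the 1300 millisecond shard now sets the total.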