On 1/14/2016 12:07 PM, Anria B. wrote:
> Here are some Actual examples, if it helps
>
> wt=json&q=*:*&indent=on&fq=SolrDocumentType:"invalidValue"&fl=timestamp&rows=0&start=0&debug=timing
<snip>
> "QTime": 590,
<snip>
> Now we wipe out all caches, and put the filter in q.
>
> wt=json&q=SolrDocumentType:"invalidValue"&indent=on&fl=timestamp&rows=0&start=0&debug=timing
<snip>
> "QTime": 266,

For uncached queries on an index with 20+ million documents that takes up 121GB of disk space, these are pretty good times.

When the query is not cached, a filter query will *always* be slower than the same thing in the q parameter. The reason for this is very simple -- the *result* of a filter query is a bitset where every document in the index is represented, with a zero for no match and a one for a match. Solr must touch every single document in the index (including deleted documents) to build this bitset. At one bit per document, the bitset for a 20 million document index is 2.5 million bytes long. This bitset is what gets put into the filterCache.

When a query is in the q parameter, there are shortcuts in Lucene that Solr uses to find *only* the number of results requested in the rows parameter, so it takes less time.

Filter queries are *lightning* fast when they are cached, because Solr does not need to do the work of checking every document in the index to see if it's in the result list. That is the reason that you will commonly see advice to move things from q to fq ... but that advice should only be followed if you expect filters to be re-used often enough to result in cache hits.

Thanks,
Shawn
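
For anyone who wants to check the bitset size quoted above, here is a minimal sketch of the arithmetic. It assumes one bit per document (counting deleted documents too) and ignores any per-object overhead in the cache. The function name is made up for illustration and is not part of Solr or Lucene.

```python
def filter_bitset_bytes(max_doc: int) -> int:
    """Bytes needed to hold one bit per document, rounded up to whole bytes.

    max_doc is assumed to be the total document count in the index,
    including deleted documents that have not been merged away.
    """
    return (max_doc + 7) // 8


# A 20 million document index needs 2.5 million bytes per cached filter.
print(filter_bitset_bytes(20_000_000))  # 2500000
```

Multiplying that figure by the number of entries the filterCache is allowed to hold gives a rough upper bound on the heap those bitsets could occupy.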