Thanks Shawn,

What's Solr's equivalent of ConstantScoreQuery? That is, what if you want to run a query that does not score, but only filters? The rationale behind using a non-cached 'fq' was just that.

Shai

On Wed, Jun 24, 2015 at 4:29 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 6/24/2015 5:28 AM, Esther Goldbraich wrote:
> > We are comparing the performance of fq versus q for queries that are
> > actually filters and should not be cached.
> > In part of queries we see strange behavior where q performs 5-10x better
> > than fq. The question is why?
> >
> > An example1:
> > q=maildate:{DATE1 to DATE2} COMPARED TO fq={!cache=false}maildate:{DATE1
> > to DATE2}
> > sort=maildate_sort* desc
>
> <snip>
>
> > <field name="maildate" stored="true" indexed="true" type="tdate"/>
> > <field name="maildate_sort" stored="false" indexed="false" type="tdate"
> > docValues="true"/>
>
> For simplicity, I would probably just use one field for that, rather
> than a separate sort field. The disk space required would probably be
> the same either way, but your interaction with the index will not be as
> complex. There's nothing wrong with doing it the way you have, though.
>
> I'm not at all an expert, but I've been a member of this community for a
> long time. Here's my guess about why your query is faster in the q
> parameter than a non-cached filter:
>
> The result of a standard query is the stored fields from the top N
> documents, where N is the value in the rows parameter. The default for
> N is typically set to 10, and for most people will normally be 200 or less.
>
> The result of a filter is very different -- it is a bitset of all the
> documents in your entire index, with binary 0 for documents that don't
> match the filter and binary 1 for documents that do match.
>
> If your index has 100 million documents, every single one of those 100
> million documents must be checked against the filter query to produce a
> filter bitset, but when it's in the q parameter, shortcuts can be taken
> which will get the top N results quickly.
>
> The filterCache levels the playing field when filters are re-used. If a
> requested filter is already in the cache, it can be retrieved and
> applied to a result VERY quickly.
>
> You have turned off the caching for your filter. I'm not sure why you
> did this, but you know your use case a lot better than I do. If it were
> me, I would use filter queries and do everything possible to re-use the
> same filters, and I would cache them.
>
> Thanks,
> Shawn
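
To make the difference in result shape that Shawn describes concrete, here is a rough Python sketch. It is only an illustration, not how Solr or Lucene is implemented: the function names, the toy index, and the dictionary standing in for the filterCache are all hypothetical. Note that this toy top_n still scans every value, so it shows only the difference in what gets produced (N document ids versus one bit per document), not the shortcuts a real query can take.

```python
import heapq

def filter_bitset(values, lo, hi):
    """Return one bit per document in the index: 1 if it matches, else 0."""
    # Exclusive bounds mirror the {DATE1 to DATE2} range in the example query.
    return [1 if lo < v < hi else 0 for v in values]

def top_n(values, lo, hi, n=10):
    """Return the ids of at most n matching documents, largest value first."""
    matches = ((v, doc_id) for doc_id, v in enumerate(values) if lo < v < hi)
    return [doc_id for _, doc_id in heapq.nlargest(n, matches)]

# Hypothetical stand-in for the filterCache: filter key -> bitset.
filter_cache = {}

def cached_filter(values, lo, hi):
    """Build the bitset once per (lo, hi) key and reuse it afterwards."""
    key = (lo, hi)
    if key not in filter_cache:
        filter_cache[key] = filter_bitset(values, lo, hi)
    return filter_cache[key]

# Toy index: document id i has the (made-up) maildate value i.
index = list(range(1000))

bits = filter_bitset(index, 100, 200)   # 1000 entries, one per document
top = top_n(index, 100, 200, n=10)      # only 10 document ids

print(len(bits), sum(bits), top)
```

The bitset is as large as the index no matter how few documents match, while the top-N result never grows past n. Reusing the same key in cached_filter returns the stored bitset without rebuilding it, which is the situation where a cached fq pays off.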