Thanks Shawn,

What's Solr's equivalent of ConstantScoreQuery? That is, what if you want
to run a query that only filters and does not score? That was the
rationale behind using a non-cached 'fq'.

Shai

On Wed, Jun 24, 2015 at 4:29 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 6/24/2015 5:28 AM, Esther Goldbraich wrote:
> > We are comparing the performance of fq versus q for queries that are
> > actually filters and should not be cached.
> > For some of the queries we see strange behavior where q performs 5-10x
> > better than fq. The question is why?
> >
> > Example 1:
> > q=maildate:{DATE1 TO DATE2} COMPARED TO fq={!cache=false}maildate:{DATE1
> > TO DATE2}
> > sort=maildate_sort* desc
>
> <snip>
>
> > <field name="maildate" stored="true" indexed="true" type="tdate"/>
> > <field name="maildate_sort" stored="false" indexed="false" type="tdate"
> > docValues="true"/>
>
> For simplicity, I would probably just use one field for that, rather
> than a separate sort field.  The disk space required would probably be
> the same either way, but your interaction with the index will not be as
> complex.  There's nothing wrong with doing it the way you have, though.
>
> I'm not at all an expert, but I've been a member of this community for a
> long time.  Here's my guess about why your query is faster in the q
> parameter than a non-cached filter:
>
> The result of a standard query is the stored fields from the top N
> documents, where N is the value in the rows parameter.  The default for
> N is typically 10, and for most people it will be 200 or less.
>
> The result of a filter is very different -- it is a bitset of all the
> documents in your entire index, with binary 0 for documents that don't
> match the filter and binary 1 for documents that do match.
>
> If your index has 100 million documents, every single one of those 100
> million documents must be checked against the filter query to produce a
> filter bitset, but when it's in the q parameter, shortcuts can be taken
> which will get the top N results quickly.
>
> The filterCache levels the playing field when filters are re-used.  If a
> requested filter is already in the cache, it can be retrieved and
> applied to a result VERY quickly.
>
> You have turned off the caching for your filter.  I'm not sure why you
> did this, but you know your use case a lot better than I do.  If it were
> me, I would use filter queries and do everything possible to re-use the
> same filters, and I would cache them.
>
> Thanks,
> Shawn
>
>
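
To make the contrast in the quoted explanation concrete, here is a toy
sketch in Python. It is not how Solr or Lucene is implemented; it only
models the three ideas described above: a filter result with one 0/1 entry
per document in the index, a query result holding just the top N documents,
and a cache that lets a repeated filter skip the work. The data and the
names MAIL_DATES, filter_bitset, top_n and cached_filter are all made up
for illustration.

```python
import heapq
from typing import Callable, Dict, List

# Toy index (hypothetical data): position = doc id, value = mail date.
MAIL_DATES: List[int] = [50, 10, 40, 20, 30, 60]


def filter_bitset(lo: int, hi: int) -> List[int]:
    """Filter-style result: one 0/1 entry for every document in the index.

    The range is exclusive on both ends, like maildate:{lo TO hi}.
    """
    return [1 if lo < date < hi else 0 for date in MAIL_DATES]


def top_n(lo: int, hi: int, n: int) -> List[int]:
    """Query-style result: only the n newest matching doc ids, newest first."""
    matching = (doc for doc, date in enumerate(MAIL_DATES) if lo < date < hi)
    return heapq.nlargest(n, matching, key=lambda doc: MAIL_DATES[doc])


# Toy stand-in for a filter cache: filter string -> previously built bitset.
_filter_cache: Dict[str, List[int]] = {}


def cached_filter(key: str, compute: Callable[[], List[int]]) -> List[int]:
    """Return the stored bitset for a filter seen before, else build and store it."""
    if key not in _filter_cache:
        _filter_cache[key] = compute()
    return _filter_cache[key]
```

In this model, filter_bitset(10, 60) always returns a list as long as the
whole index, while top_n(10, 60, 2) returns only two doc ids. Calling
cached_filter twice with the same key runs the compute function only once,
which is the benefit that cache=false gives up.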
