Re: Multiple fq vs combined fq performance

Tomás Fernández Löbbe Fri, 10 Jul 2020 10:49:46 -0700

All non-cached filters will be executed together (leapfrog between them)
and will be sorted by the filter cost (I guess that, since you aren't
setting a cost, then the order of the input matters).  You can try setting
a cost in your filters (lower than 100, so that they don't become post
filters)


One other thing though, I guess you are using Point fields? If you
typically query for a single value like in this example (vs. ranges), you
may want to use string fields for those. See
https://issues.apache.org/jira/browse/SOLR-11078.




On Fri, Jul 10, 2020 at 7:51 AM Chris Dempsey <cdal...@gmail.com> wrote:

> Thanks for the suggestion, Alex. It doesn't appear that
> IndexOrDocValuesQuery (at least in Solr 7.7.1) supports the PostFilter
> interface. I've tried various values for cost on each of the fq and it
> doesn't change the QTime.
>
> So, after digging around a bit even though
> {!cache=false}taggedTickets_ticketId:1000000241 only matches one and only
> one document in the collection that doesn't matter for the other two fq who
> continue to look over the index of the collection, correct?
>
> On Thu, Jul 9, 2020 at 4:24 PM Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
> > I _think_ it will run all 3 and then do index hopping. But if you know
> one
> > fq is super expensive, you could assign it a cost
> > Value over 100 will try to use PostFilter then and apply the query on top
> > of results from other queries.
> >
> >
> >
> >
> https://lucene.apache.org/solr/guide/8_4/common-query-parameters.html#cache-parameter
> >
> > Hope it helps,
> >     Alex.
> >
> > On Thu., Jul. 9, 2020, 2:05 p.m. Chris Dempsey, <cdal...@gmail.com>
> wrote:
> >
> > > Hi all! In a collection where we have ~54 million documents we've
> noticed
> > > running a query with the following:
> > >
> > > "fq":["{!cache=false}_class:taggedTickets",
> > >       "{!cache=false}taggedTickets_ticketId:1000000241",
> > >       "{!cache=false}companyId:22476"]
> > >
> > > when I debugQuery I see:
> > >
> > > "parsed_filter_queries":[
> > >   "{!cache=false}_class:taggedTickets",
> > >
>  "{!cache=false}IndexOrDocValuesQuery(taggedTickets_ticketId:[1000000241
> > > TO 1000000241])",
> > >   "{!cache=false}IndexOrDocValuesQuery(companyId:[22476 TO 22476])"
> > > ]
> > >
> > > runs in roughly ~450ms but if we remove `{!cache=false}companyId:22476`
> > it
> > > drops down to ~5ms (it's important to note that
> `taggedTickets_ticketId`
> > is
> > > globally unique).
> > >
> > > If we change the fqs to:
> > >
> > > "fq":["{!cache=false}_class:taggedTickets",
> > >       "{!cache=false}+companyId:22476
> > +taggedTickets_ticketId:1000000241"]
> > >
> > > when I debugQuery I see:
> > >
> > > "parsed_filter_queries":[
> > >    "{!cache=false}_class:taggedTickets",
> > >    "{!cache=false}+IndexOrDocValuesQuery(companyId:[22476 TO 22476])
> > > +IndexOrDocValuesQuery(taggedTickets_ticketId:[1000000241 TO
> > 1000000241])"
> > > ]
> > >
> > > we get the correct result back in ~5ms.
> > >
> > > My current thought is that in the slow scenario Solr is still running
> > > `{!cache=false}IndexOrDocValuesQuery(companyId:[22476
> > > TO 22476])` even though it "has the answer" from the first two fq.
> > >
> > > Am I off-base or misunderstanding how `fq` are processed?
> > >
> >
>

Re: Multiple fq vs combined fq performance

Reply via email to