Re: fq versus q

Erick Erickson Wed, 24 Jun 2015 16:39:05 -0700

Tell us a bit more about your test setup. One or two tests
don't mean much. For instance, if the fq query has to
load the low-level caches from disk, and the q-only
query is then run and doesn't, that could skew the results.
Or if somehow you're hitting the queryResultCache. Or....


Frankly I'd disable all my caches for running tests like
this, and be sure to mix-n-match the tests so I wasn't
getting bitten by caches.
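
Here's a minimal sketch of the kind of harness I mean, in Python. It only
builds the two request variants and a shuffled run order; actually sending
the requests and recording the times is left out. The base URL, the
collection name, the q=*:* paired with the fq, and all the function names
are my assumptions, not anything from your setup. DATE1/DATE2 are your
placeholders and need real values before this would run against Solr.

```python
import random
import statistics
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host, port, and collection to your setup.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"
# Placeholder range from the original post; substitute real dates.
RANGE_CLAUSE = "maildate:{DATE1 TO DATE2}"


def build_params(variant, range_clause=RANGE_CLAUSE, rows=10):
    """Return request parameters for the 'q' or the 'fq' variant of the test."""
    if variant == "q":
        params = {"q": range_clause}
    elif variant == "fq":
        # Assumption: the main query matches everything and the filter narrows it.
        params = {"q": "*:*", "fq": "{!cache=false}" + range_clause}
    else:
        raise ValueError("variant must be 'q' or 'fq'")
    params.update({"rows": rows, "sort": "maildate_sort desc", "wt": "json"})
    return params


def request_url(variant):
    """Build the full URL for one variant, with parameters URL-encoded."""
    return SOLR_SELECT + "?" + urlencode(build_params(variant))


def interleaved_schedule(trials_per_variant, seed=0):
    """Shuffle q and fq runs together so neither variant always runs first."""
    schedule = ["q", "fq"] * trials_per_variant
    random.Random(seed).shuffle(schedule)
    return schedule


def summarize(timings_ms):
    """Summarize many runs; a single timing is not informative."""
    return {
        "runs": len(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "min_ms": min(timings_ms),
        "max_ms": max(timings_ms),
    }
```

Run each variant a few dozen times in the shuffled order and compare the
medians rather than individual runs.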

And please tell us what the actual numbers are. 5-10x
doesn't mean much at all if it's 25 ms vs. 5 ms. It means
a lot (and something's very wrong) if it's
1,000 ms vs. 200 ms.
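
To spell out the arithmetic, here is a tiny Python sketch. The function
name is mine, and the numbers are just the two illustrative cases above,
not measurements:

```python
def compare(fast_ms, slow_ms):
    """Return both the ratio and the absolute gap between two timings."""
    return {"ratio": slow_ms / fast_ms, "difference_ms": slow_ms - fast_ms}


small = compare(5, 25)      # same 5x ratio, but only 20 ms apart
large = compare(200, 1000)  # same 5x ratio, 800 ms apart
```

The ratio is identical in both cases; only the absolute difference tells
you whether there's a real problem.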

Best,
Erick

On Wed, Jun 24, 2015 at 5:30 PM, Upayavira <u...@odoko.co.uk> wrote:
> Are you wanting to do no scoring at all, or just have a portion of the
> query not contribute to the score?
>
> If you don't want scoring at all, just sort by another field. If you
> don't have a field, I just tried "&sort=1 desc", and it worked! This
> should, if I'm right, pull documents out of the index in index order.
>
> Upayavira
>
> On Wed, Jun 24, 2015, at 08:26 PM, Shai Erera wrote:
>> Ah thanks. I see it was added in 5.1 - is there any other way prior to
>> that (like 4.7)?
>>
>> If not, I guess the only option is not to use fq if we don't intend to
>> cache it, and on 5.1 to use the ^= syntax.
>>
>> Shai
>>
>> On Wed, Jun 24, 2015 at 9:21 PM, Jack Krupansky <jack.krupan...@gmail.com>
>> wrote:
>>
>> > Yonik added syntax to request a constant score query in Solr with the ^=
>> > operator.
>> >
>> > For example: +color:blue^=1 text:shoes
>> >
>> > See:
>> > https://issues.apache.org/jira/browse/SOLR-7218
>> >
>> > -- Jack Krupansky
>> >
>> > On Wed, Jun 24, 2015 at 1:41 PM, Shai Erera <ser...@gmail.com> wrote:
>> >
>> > > Thanks Shawn,
>> > >
>> > > What's Solr's equivalent of ConstantScoreQuery? I.e., what if you want to
>> > > run a query that does not score, but only filters? The rationale behind
>> > > using a non-cached 'fq' was just that.
>> > >
>> > > Shai
>> > >
>> > > On Wed, Jun 24, 2015 at 4:29 PM, Shawn Heisey <apa...@elyograg.org>
>> > > wrote:
>> > >
>> > > > On 6/24/2015 5:28 AM, Esther Goldbraich wrote:
>> > > > > We are comparing the performance of fq versus q for queries that are
>> > > > > actually filters and should not be cached.
>> > > > > For some of the queries we see strange behavior where q performs 5-10x
>> > > > > better than fq. The question is why?
>> > > > >
>> > > > > Example 1:
>> > > > > q=maildate:{DATE1 to DATE2} COMPARED TO
>> > > > > fq={!cache=false}maildate:{DATE1 to DATE2}
>> > > > > sort=maildate_sort* desc
>> > > >
>> > > > <snip>
>> > > >
>> > > > > <field name="maildate" stored="true" indexed="true" type="tdate"/>
>> > > > > <field name="maildate_sort" stored="false" indexed="false"
>> > > > > type="tdate" docValues="true"/>
>> > > >
>> > > > For simplicity, I would probably just use one field for that, rather
>> > > > than a separate sort field.  The disk space required would probably be
>> > > > the same either way, but your interaction with the index will not be as
>> > > > complex.  There's nothing wrong with doing it the way you have, though.
>> > > >
>> > > > I'm not at all an expert, but I've been a member of this community for
>> > > > a long time.  Here's my guess about why your query is faster in the q
>> > > > parameter than as a non-cached filter:
>> > > >
>> > > > The result of a standard query is the stored fields from the top N
>> > > > documents, where N is the value in the rows parameter.  The default for
>> > > > N is typically set to 10, and for most people will normally be 200 or
>> > > > less.
>> > > >
>> > > > The result of a filter is very different -- it is a bitset of all the
>> > > > documents in your entire index, with binary 0 for documents that don't
>> > > > match the filter and binary 1 for documents that do match.
>> > > >
>> > > > If your index has 100 million documents, every single one of those 100
>> > > > million documents must be checked against the filter query to produce a
>> > > > filter bitset, but when it's in the q parameter, shortcuts can be taken
>> > > > which will get the top N results quickly.
>> > > >
>> > > > The filterCache levels the playing field when filters are re-used.  If
>> > > > a requested filter is already in the cache, it can be retrieved and
>> > > > applied to a result VERY quickly.
>> > > >
>> > > > You have turned off the caching for your filter.  I'm not sure why you
>> > > > did this, but you know your use case a lot better than I do.  If it
>> > > > were me, I would use filter queries and do everything possible to
>> > > > re-use the same filters, and I would cache them.
>> > > >
>> > > > Thanks,
>> > > > Shawn
>> > > >
>> > > >
>> > >
>> >
