Re: Efficient boolean query

Ofer Fort Wed, 02 Mar 2011 12:16:56 -0800

I'm guessing what I was describing is short-circuit evaluation, and I see
that Lucene doesn't have it:
http://lucene.472066.n3.nabble.com/Short-circuit-in-query-td738551.html


Still would love to hear any suggestions for my type of query

ofer
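
A client-side workaround for the missing short-circuit can be sketched as follows. This is only an assumption about how the calls might be structured, not something proposed in this thread: `count_hits` and `run_query` are hypothetical callables standing in for whatever Solr client is in use. The idea is to run the cheap term query first (the one that returns in a millisecond) and skip the expensive range query when it finds nothing.

```python
from typing import Callable, List


def search_with_short_circuit(
    count_hits: Callable[[str], int],
    run_query: Callable[[str], List[dict]],
    term: str,
    range_clause: str,
) -> List[dict]:
    """Run the cheap term query first; skip the range query if it is empty."""
    # Hypothetical helper: count_hits(term) is assumed to issue the term-only
    # query (for example with rows=0) and return its numFound value.
    if count_hits(term) == 0:
        return []  # no document contains the term, so the AND cannot match
    # Only now pay for the expensive range query.
    return run_query(f"{range_clause} AND {term}")
```

The trade-off is one extra round trip whenever the term does exist, so this probably only helps when the empty case is common.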

On Wed, Mar 2, 2011 at 8:58 PM, Ofer Fort <o...@tra.cx> wrote:

> Thanks,
> But each query tries to see if there is something new since the last result
> that was found, so rounding will return the same documents over and
> over again until we reach the next rounded point.
>
> Could I use the document ID somehow? Or something else that's bigger than
> my last search?
>
> And even if it were a simple term query, on the Lucene side of things, why
> would it try to fetch ALL the terms if one of the required ones resulted in
> an empty set?
>
> thanks for your help, specifically on this matter and in general, to the
> search community :-)
>
>
> On Wed, Mar 2, 2011 at 8:35 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
>
>> One way to speed things up would be to reduce the resolution on
>> timestamps that you index.
>> Another way would be to decrease the precisionStep on the tdate field
>> type (bigger index, but faster range queries)
>> Yet another way is to use "fq" filters that can be reused many times.
>>
>> One way to increase fq reuse is to round.
>> This rounds up to the nearest hour... assumes 2011-02-01T00:00:00Z is
>> the same across many queries.
>> fq=timestamp:[2011-02-01T00:00:00Z TO NOW/HOUR+1HOUR]
>>
>> Another way is to split the filter into two parts - a large part that
>> doesn't change much + a small part that does.
>> Again this assumes that the first endpoint is reused across many queries.
>> fq=timestamp:[2011-02-01T00:00:00Z TO
>> NOW/HOUR+1HOUR]&fq=timestamp:[NOW/HOUR TO NOW]
>>
>> If the first endpoint is *not* reused across many queries, then you
>> can still use the same strategy as above by adding another small "fq"
>> for the lower endpoint.
>>
>> -Yonik
>> http://lucidimagination.com
>>
>>
>>
>> On Wed, Mar 2, 2011 at 1:11 PM, Ofer Fort <o...@tra.cx> wrote:
>> > You are correct that my query is a range one; I probably should have
>> > mentioned it in the first post.
>> > This is the debug data:
>> >
>> > <?xml version="1.0" encoding="UTF-8"?>
>> > <response>
>> >
>> > <lst name="responseHeader">
>> >  <int name="status">0</int>
>> >  <int name="QTime">4173</int>
>> >  <lst name="params">
>> >  <str name="debugQuery">on</str>
>> >  <str name="indent">on</str>
>> >
>> >  <str name="start">0</str>
>> >  <str name="q">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
>> >  <str name="version">2.2</str>
>> >  <str name="rows">10</str>
>> >  </lst>
>> > </lst>
>> > <result name="response" numFound="0" start="0"/>
>> > <lst name="debug">
>> >
>> >  <str name="rawquerystring">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
>> >  <str name="querystring">timestamp:[2011-02-01T00:00:00Z TO NOW] AND oferiko</str>
>> >  <str name="parsedquery">+timestamp:[1296518400000 TO 1299069584823] +contents:oferiko</str>
>> >  <str name="parsedquery_toString">+timestamp:[1296518400000 TO 1299069584823] +contents:oferiko</str>
>> >  <lst name="explain"/>
>> >  <str name="QParser">LuceneQParser</str>
>> >
>> >  <lst name="timing">
>> >  <double name="time">4171.0</double>
>> >  <lst name="prepare">
>> >    <double name="time">0.0</double>
>> >    <lst name="org.apache.solr.handler.component.QueryComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >
>> >    <lst name="org.apache.solr.handler.component.FacetComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.HighlightComponent">
>> >     <double name="time">0.0</double>
>> >
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.StatsComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.DebugComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >  </lst>
>> >
>> >  <lst name="process">
>> >    <double name="time">4171.0</double>
>> >    <lst name="org.apache.solr.handler.component.QueryComponent">
>> >     <double name="time">4171.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.FacetComponent">
>> >     <double name="time">0.0</double>
>> >
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.MoreLikeThisComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.HighlightComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.StatsComponent">
>> >
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >    <lst name="org.apache.solr.handler.component.DebugComponent">
>> >     <double name="time">0.0</double>
>> >    </lst>
>> >  </lst>
>> >  </lst>
>> > </lst>
>> >
>> > </response>
>> >
>> >
>> > On Wed, Mar 2, 2011 at 7:48 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
>> >
>> >> On Wed, Mar 2, 2011 at 12:11 PM, Ofer Fort <ofer...@gmail.com> wrote:
>> >> > Hey all,
>> >> > I have an index with a lot of documents with the term X and no documents
>> >> > with the term Y.
>> >> > If I query for X, it takes a few seconds and returns the results.
>> >> > If I query for Y, it takes a millisecond and returns an empty set.
>> >> > If I query for Y AND X, it takes a few seconds and returns an empty set.
>> >>
>> >> This depends on the specifics of what X is.   Some query types must
>> >> generate all hits first internally - an example is a multi-term query
>> >> (like numeric range query, etc) that matches many terms.
>> >>
>> >> Can you show the generated query (i.e. add debugQuery=true to the request)?
>> >>
>> >> -Yonik
>> >> http://lucidimagination.com
>> >>
>> >
>>
>
>
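
Yonik's rounding suggestion quoted above can be sketched as a small request builder. This is a minimal sketch, not code from the thread: the base URL and the function name `build_select_url` are assumptions made for illustration, and only the `q`, `fq` and `rows` parameters that appear in the thread are used. The point is that the range moves out of `q` into an `fq` whose text stays identical for a whole hour, so the cached filter can be reused.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, term: str, start: str) -> str:
    """Build a select URL with the date range as a reusable, rounded fq."""
    params = [
        ("q", term),  # the cheap term query stays in q
        # The upper bound is rounded up to the next hour, so the fq string
        # (and therefore its cache entry) is stable for up to an hour.
        ("fq", f"timestamp:[{start} TO NOW/HOUR+1HOUR]"),
        ("rows", "10"),
    ]
    # urlencode percent-escapes characters such as ':', '[', '/' and '+'.
    return base_url + "?" + urlencode(params)


# Assumed local Solr endpoint, for illustration only.
url = build_select_url(
    "http://localhost:8983/solr/select", "oferiko", "2011-02-01T00:00:00Z"
)
```

As noted in the reply above, rounding alone returns the same documents until the next rounded point, so it suits repeated queries better than "what is new since my last search" polling.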
