Re: How does query on AND work

Per Steffensen Mon, 26 May 2014 04:49:55 -0700

I do not know if this is a special case. I guess an AND-query where one side hits 500-1000 and the other side hits billions is a special case. But this way of carrying out the query might also be an optimization in less uneven cases. It does not require that the "lots of hits"-part of the query is a range-query, and it does not necessarily require that the field used in this part is DocValue (you can go fetch the values from the "slow" store). But I guess it has to be a very uneven case if this approach should be faster on a non-DocValue field.

I think this can be generalized. I think of it as something similar to being able to "hint" relational databases not to use a specific index. I do not know that much about Solr/Lucene query-syntax, but I believe "filter-queries" (fq) are kind of queries that will be AND'ed onto the real query (q), and in order not to have to change the query-syntax too much (adding hints or something), I guess a first step for a feature doing what I am doing here could be to introduce something similar to "filter-queries": queries that will be carried out on the result of (q + fqs), but by looking at the values of the documents in that result instead of intersecting with doc-sets found from the index. Let's call them "post-query-value-filter"s (yes, we can definitely come up with a better/shorter name).

1) q=no_dlng_doc_ind_sto:(<NO>) AND timestamp_dlng_doc_ind_sto:([<TIME_START> TO <TIME_END>])
2) q=no_dlng_doc_ind_sto:(<NO>), fq=timestamp_dlng_doc_ind_sto:([<TIME_START> TO <TIME_END>])
3) q=no_dlng_doc_ind_sto:(<NO>), post-query-value-filter=timestamp_dlng_doc_ind_sto:([<TIME_START> TO <TIME_END>])

1) and 2) both use the index on both no_dlng_doc_ind_sto and timestamp_dlng_doc_ind_sto. 3) uses only the index on no_dlng_doc_ind_sto and does the time-interval filter part by fetching values (using DocValue if possible) for timestamp_dlng_doc_ind_sto for each of the docs found through the no_dlng_doc_ind_sto-index, to see if the doc should really be included.
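
To make the difference concrete, here is a minimal sketch in Python. It is not Solr or Lucene code; it only imitates the two strategies on a toy in-memory data set. The field names "no" and "timestamp", the sample documents, and all function names are assumptions made up for the illustration.

```python
# Toy stand-in for a Solr/Lucene index. Everything here is hypothetical
# and only illustrates the two evaluation strategies discussed above.
docs = {
    0: {"no": 7, "timestamp": 100},
    1: {"no": 7, "timestamp": 250},
    2: {"no": 8, "timestamp": 120},
    3: {"no": 7, "timestamp": 180},
    4: {"no": 9, "timestamp": 150},
    5: {"no": 8, "timestamp": 300},
}


def build_postings(docs, field):
    """Map each value of `field` to a sorted list of doc ids (a toy inverted index)."""
    postings = {}
    for doc_id in sorted(docs):
        postings.setdefault(docs[doc_id][field], []).append(doc_id)
    return postings


def intersect_with_range(no_postings, ts_postings, no, start, end):
    """Forms 1) and 2): build the doc set for the whole time range from the
    index, then intersect it with the docs matching `no`."""
    range_docs = set()
    for ts, ids in ts_postings.items():
        if start <= ts <= end:
            range_docs.update(ids)  # cost grows with the number of range hits
    return [d for d in no_postings.get(no, []) if d in range_docs]


def post_filter_by_value(no_postings, ts_values, no, start, end):
    """Form 3): use the index only for `no`, then look up the timestamp value
    of each candidate doc (the role DocValues would play)."""
    return [d for d in no_postings.get(no, []) if start <= ts_values[d] <= end]


no_postings = build_postings(docs, "no")
ts_postings = build_postings(docs, "timestamp")
ts_values = {doc_id: d["timestamp"] for doc_id, d in docs.items()}

by_intersection = intersect_with_range(no_postings, ts_postings, 7, 90, 200)
by_post_filter = post_filter_by_value(no_postings, ts_values, 7, 90, 200)
```

Both functions return the same doc ids. The difference is the work done: the first touches every doc in the time range, the second touches only the docs that match "no", which is what matters when one side hits 500-1000 docs and the other hits billions.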

There are some things that I did not initially tell, about actually wanting to do a facet search etc. Well, here is the full story: http://solrlucene.blogspot.dk/2014/05/performance-of-and-queries-with-uneven.html


Regards, Per Steffensen

On 23/05/14 17:37, Toke Eskildsen wrote:

Per Steffensen [st...@designware.dk] wrote:

* It IS more efficient to just use the index for the
"no_dlng_doc_ind_sto"-part of the request to get doc-ids that match that
part and then fetch timestamp-doc-values for those doc-ids to filter out
the docs that do not match the "timestamp_dlng_doc_ind_sto"-part of
the query.

Thank you for the follow up. It sounds rather special-case though, with 
requirement of DocValues for the range-field. Do you think this can be 
generalized?

- Toke Eskildsen
