I don't know the answer to feasibility either, but I'll just point out
that boolean "OR" corresponds to set "union", not set "intersection".
So I think you probably mean a 'union' type of filter query.
'Intersection' does not seem to fit what you are describing, since
ordinary 'fq' values are already 'intersected' to restrict the result
set, no?
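To make the terminology concrete, here is a minimal sketch in Python, with plain sets of made-up document IDs standing in for Solr's filter bitsets. The variable names and IDs are purely illustrative and have nothing to do with Solr's actual internals.

```python
# Hypothetical document ID sets; in Solr these would be filter bitsets.
docs_foo_bar = {1, 2, 3, 5}
docs_foo_baz = {3, 4, 5, 8}

# Boolean OR: a document matches if it is in either set (set union).
either = docs_foo_bar | docs_foo_baz

# Boolean AND: a document matches only if it is in both sets (set intersection).
# This is what happens today when two separate fq parameters are supplied.
both = docs_foo_bar & docs_foo_baz

print(sorted(either))
print(sorted(both))
```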
So, anyhow, the basic goal, if I understand it right, is not to provide
any additional semantics, but to allow individual clauses in an 'fq'
"OR" to be cached and looked up in the filter cache individually.
Perhaps someone (not me) who understands the Solr architecture better
might also have another suggestion for how to get to that goal, other
than the specific thing you suggested. I do not know, sorry.
Hmm, but now I start thinking: what about a general-purpose mechanism to
identify a sub-clause that should be fetched from the filter cache?
I don't _think_ current nested queries will do that:
fq=_query_:"foo:bar" OR _query_:"foo:baz"
That's legal now (and doesn't accomplish much) -- but what if the
individual subquery components could consult the filter cache
separately? I don't know if a nested query is the right way to do that or
not, but I'm thinking some mechanism where you could arbitrarily
identify clauses that should be filter cached independently?
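Roughly what I have in mind, as a toy sketch in Python rather than anything resembling real Solr code. The class ClauseFilterCache, the fake_search function, and the tiny INDEX are all hypothetical names invented for illustration; the only point is that each clause is looked up in the cache on its own and the results are then unioned.

```python
from typing import Callable, Dict, FrozenSet, Iterable


class ClauseFilterCache:
    """Toy stand-in for a filter cache keyed by individual clause text."""

    def __init__(self, evaluate: Callable[[str], Iterable[int]]) -> None:
        self._evaluate = evaluate          # runs a clause against the index
        self._cache: Dict[str, FrozenSet[int]] = {}
        self.misses = 0                    # counts how often we had to evaluate

    def docs_for(self, clause: str) -> FrozenSet[int]:
        # Evaluate the clause only if it has not been cached before.
        if clause not in self._cache:
            self.misses += 1
            self._cache[clause] = frozenset(self._evaluate(clause))
        return self._cache[clause]

    def union_of(self, clauses: Iterable[str]) -> FrozenSet[int]:
        # OR the clauses together by unioning their cached document sets.
        result: FrozenSet[int] = frozenset()
        for clause in clauses:
            result |= self.docs_for(clause)
        return result


# Made-up "index": clause text -> matching document IDs.
INDEX = {"foo:bar": {1, 2, 3}, "foo:baz": {3, 4}, "foo:qux": {7}}


def fake_search(clause: str) -> Iterable[int]:
    return INDEX.get(clause, set())


cache = ClauseFilterCache(fake_search)
first = cache.union_of(["foo:bar", "foo:baz"])
second = cache.union_of(["foo:baz", "foo:qux"])   # foo:baz is reused here
print(sorted(first), sorted(second), cache.misses)
```

The second request shares the foo:baz clause with the first, so only foo:qux has to be evaluated again. With a single monolithic fq string, both requests would be separate cache entries.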
Jonathan
On 7/27/2011 4:00 PM, Shawn Heisey wrote:
I've been looking at the slow queries our Solr installation is
receiving. They are dominated by queries with a simple q parameter
(often *:* for all docs) and a VERY complicated fq parameter. The
filter query is built by going through a set of rules for the user and
putting together each rule's query clause separated by OR -- we can't
easily break it into multiple filters.
In addition to causing queries themselves to run slowly, this causes
large autowarm times for our filterCache -- my filterCache
autowarmCount is tiny (4), but it sometimes takes 30 seconds to warm.
I've seen a number of requests here for the ability to have multiple
fq parameters ORed together. This is probably possible, but in the
interests of compatibility between versions, very impractical. What
if a new parameter was introduced? It could be named fqi, for filter
query intersection. To figure out the final bitset for multiple fq
and fqi parameters, it would use this kind of logic:
fq AND fq AND fq AND (fqi OR fqi OR fqi)
This would let us break our filters into manageable pieces that can
efficiently populate the filterCache, and they would autowarm quickly.
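A rough sketch of that combination logic, in Python with plain sets standing in for bitsets. The function name combine_filters and the sample document IDs are made up for illustration; fqi is only the parameter proposed above, and this is not a description of how Solr computes filters internally.

```python
from functools import reduce
from typing import Iterable, List, Set


def combine_filters(all_docs: Iterable[int],
                    fq_sets: List[Set[int]],
                    fqi_sets: List[Set[int]]) -> Set[int]:
    """Compute fq AND fq AND ... AND (fqi OR fqi OR ...)."""
    # Every fq set must match, so intersect them, starting from all documents.
    required = reduce(lambda acc, s: acc & s, fq_sets, set(all_docs))
    if not fqi_sets:
        # No fqi parameters were given: behave like plain fq filtering.
        return required
    # At least one fqi set must match, so union them.
    any_of: Set[int] = set().union(*fqi_sets)
    return required & any_of


all_docs = set(range(1, 11))                      # hypothetical doc IDs 1..10
fq_sets = [{1, 2, 3, 4, 5, 6}, {2, 3, 4, 5, 6, 7}]
fqi_sets = [{2}, {5, 9}]
result = combine_filters(all_docs, fq_sets, fqi_sets)
print(sorted(result))
```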
Is the filter design in Solr separated cleanly enough to make this at
all reasonable? I'm not a Java developer, so I'd have a tough time
implementing it myself. When I have a free moment I will take a look
at the code anyway. I'm trying to teach myself Java.
Thanks,
Shawn