Then I predict they will continue to grow, and whatever limit you put on
maxBooleanClauses will be exceeded later. And so on. So I really think you
need to re-think your model.
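(For reference: if the limit does need raising in the meantime, it is set in
solrconfig.xml, inside the <query> section. The value below is illustrative
only; the default is 1024, and going above it should be a conscious decision,
as discussed further down the thread.)

```xml
<!-- solrconfig.xml: raise the boolean clause limit (illustrative value) -->
<query>
  <maxBooleanClauses>10240</maxBooleanClauses>
</query>
```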
One approach: change your model so your users are assigned to a fixed
number of groups, then index group tokens with each document. You can
index as many tokens in the _document_ as you want. Then your process
looks like this:
1> the user signs on; you query the system of record for her groups.
2> each query from that user gets a filter query with their group tokens.

The problem with this approach is that if groups change, you have to
re-index the affected documents. But it is fast: essentially you exchange
up-front work at indexing time for _much_ less work at query time.

Second approach: use post-filters, see:
http://lucidworks.com/blog/advanced-filter-caching-in-solr/
These were first created for the ACL problem.

Best,
Erick

On Tue, Oct 14, 2014 at 4:31 AM, ankit gupta <ankitgupta...@gmail.com> wrote:
> Thanks Erick for responding.
>
> We have assigned 4 GB of memory to the Solr server, and at high load,
> where queries have more than 10K boolean clauses, the combination of
> the cache and the high clause count is causing the system to break.
> The system worked fine for the last 8 months, but of course the number
> of boolean clauses has grown over time, which I believe is what broke
> the system. That's why I am looking for some numbers that tell me how
> much memory Solr will take to process, say, 1K boolean clauses in a
> query.
>
> Our requirement really does call for such a huge number of boolean
> clauses: we need to present only the search results the user is
> entitled to.
>
> The entitlement logic depends on multiple packages. For example, if a
> user is entitled to packages A and B, we need to present results
> tagged with package A or package B.
>
> These packages have grown over time and seem to be causing the issues.
>
> Thanks,
> Ankit
>
> On Mon, Oct 13, 2014 at 5:53 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Of course there will be performance and memory changes.
>> The only real question is whether your situation can tolerate them.
>> The whole point of maxBooleanClauses is exactly that going above that
>> limit should be a conscious decision, because it has implications for
>> both memory and performance.
>>
>> That said, that limit was put in there quite some time ago, and things
>> are much faster now. I've seen installations where this limit is
>> raised over 10K.
>>
>> Are you sure this is the best approach, though? Could joins work here?
>> Or reranking? (This last is doubtful, but...)
>>
>> This may well be an XY problem; you haven't explained _why_ you need
>> so many conditions, which might enable other suggestions.
>>
>> Best,
>> Erick
>>
>> On Mon, Oct 13, 2014 at 9:10 AM, ankit gupta <ankitgupta...@gmail.com>
>> wrote:
>> > hi,
>> >
>> > Can we quantify the impact on Solr memory usage/performance if we
>> > increase the boolean clauses? I am currently using a lot of OR
>> > clauses in the query (close to 10K) and can see the heap size
>> > growing.
>> >
>> > Thanks,
>> > Ankit
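(A minimal sketch of the first approach discussed above: group tokens are
indexed per document, and at query time the user's groups become a single
filter query rather than thousands of OR clauses in q. The field name
"groups" and the SolrJ call mentioned in the comments are assumptions for
illustration, not taken from this thread.)

```java
import java.util.List;

/**
 * Sketch: at sign-on, fetch the user's groups from the system of record;
 * at query time, attach them as one filter query. The field name "groups"
 * is hypothetical.
 */
public class GroupFilter {

    /** Builds a filter query such as groups:(packageA OR packageB). */
    static String buildGroupFq(List<String> groups) {
        // Real group names may need escaping of Solr query-syntax characters.
        return "groups:(" + String.join(" OR ", groups) + ")";
    }

    public static void main(String[] args) {
        String fq = buildGroupFq(List.of("packageA", "packageB"));
        System.out.println(fq); // groups:(packageA OR packageB)
        // With SolrJ this would typically be attached via
        // solrQuery.addFilterQuery(fq) so it hits the filter cache.
    }
}
```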