Hi all,

I am running a Solr application and need to implement a feature that requires faceting and filtering on a large list of IDs. The IDs are stored outside of Solr and are specific to the currently logged-in user. An example is the set of articles/tweets the user has read in the last few weeks. Note that the IDs here are the real document IDs, not Lucene internal docids.

So the question is: what would be the best way to implement this in Solr? The list could be as large as tens of thousands of IDs. The obvious approach of rewriting the Solr query to add the ID list as "facet.query" and "fq" doesn't seem to be the best way because: a) the query would be very long, and b) it would surely exceed the default limit of 1024 Boolean clauses, and I am sure that limit is there for a reason.
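To make the size problem concrete, here is a rough Python sketch of the naive rewrite. The field name "id", the "doc0", "doc1", ... IDs, and the helper names are made up for illustration; it also assumes the IDs are simple tokens that need no escaping.

```python
# Default Boolean clause limit mentioned above.
MAX_BOOLEAN_CLAUSES = 1024


def build_naive_fq(ids, field="id"):
    """Build an fq value that ORs every ID together as a separate clause."""
    return f"{field}:(" + " OR ".join(ids) + ")"


def exceeds_clause_limit(ids, limit=MAX_BOOLEAN_CLAUSES):
    """Each ID becomes one Boolean clause, so the list length is the clause count."""
    return len(ids) > limit


# Hypothetical list of 20,000 IDs the user has read.
read_ids = [f"doc{i}" for i in range(20_000)]
fq = build_naive_fq(read_ids)

print(len(read_ids), "clauses,", len(fq), "characters in fq")
print("over the limit:", exceeds_clause_limit(read_ids))
```

Even before Solr parses anything, the fq string alone runs to a couple of hundred thousand characters, and the same list would have to be repeated for the facet.query.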

I had a similar problem before, but back then I was using Lucene directly. I solved it by using a MultiTermQuery to retrieve the internal docids for the ID list and then applying the resulting DocSet to counting and filtering. With proper caching, it worked reasonably well for lists of around 10K IDs. My current application is so invested in Solr that going back to Lucene is no longer an option.
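In case the idea is unclear, here is a toy Python model of what that approach did. It is not Lucene or Solr code: the in-memory "index", the sample documents, and all function names are invented for illustration only.

```python
from collections import Counter

# Toy index: position in the list is the internal docid;
# each entry is (external ID, category). All data is made up.
index = [
    ("a1", "sports"),
    ("a2", "politics"),
    ("a3", "sports"),
    ("a4", "tech"),
]

# Stand-in for the term dictionary on the ID field: external ID -> internal docid.
id_to_docid = {ext_id: docid for docid, (ext_id, _) in enumerate(index)}


def resolve_docids(external_ids):
    """Map external IDs to internal docids, skipping IDs that are not indexed."""
    return {id_to_docid[i] for i in external_ids if i in id_to_docid}


def facet_counts(docids):
    """Count categories over the resolved doc set (the 'counting' step)."""
    return Counter(index[d][1] for d in docids)


def filter_results(result_docids, docids):
    """Keep only search results that are in the resolved doc set (the 'filtering' step)."""
    return [d for d in result_docids if d in docids]


# The resolved set is what gets cached per user and reused across requests.
read_set = resolve_docids(["a1", "a3", "a4", "missing"])
print(facet_counts(read_set))
print(filter_results([0, 1, 2], read_set))
```

The point is that the ID list is resolved to a doc set once and then reused for both counting and filtering, instead of being parsed as thousands of query clauses on every request.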

All advice and suggestions are welcome.

Thanks,
Tri
