On 10/8/2012 4:09 PM, kevinlieb wrote:
Thanks for all the replies.
I oversimplified the problem to keep my post small and concise. I am
really trying to find the counts of documents, by a list of 10 different
authors, that match those keywords. Of course, when looking up a single
author there is no reason to do a facet query. To be clearer:
Find all documents that contain the word "dude" or "thedude" or
"anotherdude", and count how many of these were written by each of
"eldudearino", "zeedudearino", "adudearino", and "beedudearino".
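A request matching that description might be sketched as below; the host, core path, and the "text" and "author" field names are assumptions, not taken from the original post.

```python
from urllib.parse import urlencode

# Hypothetical sketch: match any of the keywords once, then get a
# per-author count via one facet.query per author of interest.
params = [
    ("q", "text:(dude OR thedude OR anotherdude)"),
    ("rows", "0"),  # only counts are needed, not the documents themselves
    ("facet", "true"),
    ("facet.query", "author:eldudearino"),
    ("facet.query", "author:zeedudearino"),
    ("facet.query", "author:adudearino"),
    ("facet.query", "author:beedudearino"),
]
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Each facet.query comes back as its own count in the response, so one request covers all ten authors.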
I tried facet.query as well as facet.method=fc, and neither really helped.
We are constantly adding documents to the Solr index and committing every
few seconds, which is probably why this is not working well.
Seems we need to re-architect the way we are doing this...
I would definitely consider increasing the amount of time between
commits. You can add documents at whatever interval you want, but if
you only do commits every minute or two, your caches will be much more
useful.
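One way to lengthen the interval between commits without changing how often you add documents is Solr's autoCommit setting in solrconfig.xml. This is only an illustrative fragment; the one-minute value is an example, not a recommendation for every setup.

```xml
<!-- solrconfig.xml: commit automatically, at most once per minute -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>60000</maxTime>  <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```

With this in place, the client can stop issuing its own commit after every batch of adds.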
Your time slice filter query (NOW-5MINUTES) will never be cached,
because NOW is measured in milliseconds and will therefore be different
for every query. You might consider using NOW/MINUTE-5MINUTES instead,
or even [NOW/MINUTE-5MINUTES TO *] so that you are actually dealing
with a range. For the span of that minute (at least until the cache
is invalidated by a commit), the filter cache entry will remain valid.
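A minimal sketch of why the rounding matters, using Python datetimes as a stand-in for Solr's date math (the round_to_minute helper is hypothetical, not a Solr API):

```python
from datetime import datetime

def round_to_minute(dt):
    """Drop seconds and smaller units, like Solr's NOW/MINUTE date math."""
    return dt.replace(second=0, microsecond=0)

# Two queries issued a few seconds apart within the same minute:
a = round_to_minute(datetime(2012, 10, 8, 16, 9, 3, 123456))
b = round_to_minute(datetime(2012, 10, 8, 16, 9, 41, 987654))

# Unrounded, the two timestamps differ, so a NOW-based filter query is
# textually unique and never reused from the filter cache. Rounded, both
# queries produce the identical filter string, so one cache entry serves
# every query in that minute until a commit invalidates it.
assert a == b == datetime(2012, 10, 8, 16, 9)
```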
Some general questions that may matter: How big are all your index
directories on this server, how much RAM is in the server, and how much
RAM are you giving to Java? I'm also curious how big your Solr caches
are, what the autowarm counts are, and how long it is taking for your
caches to warm up after each commit. You can get the warm times from
the cache statistics in the admin interface.
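For reference, the filter cache size and autowarm count are set in solrconfig.xml; the values below are purely illustrative and would need tuning against your own index and warm times.

```xml
<!-- solrconfig.xml: example values only -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="128"/>
```

A large autowarmCount makes each new searcher slower to open, which matters when commits happen every few seconds.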
Thanks,
Shawn