Can you elaborate on "running SOLR-20 with a hibernate-solr auto link"? You mean you listen to Hibernate events and use them to keep the index served by Solr in sync with the DB?
I built a HibernateEventWatcher modeled after the Compass framework that automatically gets notified on insert/update/delete. Anything saved that the "SolrDocumentBuilder" knows what to do with gets sent to Solr automatically, so the Solr index stays in sync with the SQL database without any explicit work. A first pass at this is here: http://solrstuff.org/svn/solrj-hibernate/src/org/apache/solr/client/solrj/hibernate/SolrSync.java Lots of that has changed in the production code; when SOLR-20 stabilizes, I'll put the good bits back in and hopefully post it in a 'contrib' section.
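For a concrete picture, here is a minimal sketch of the listener approach, assuming the Hibernate 3.x event API and the SolrJ CommonsHttpSolrServer client of that era. The SolrDocumentBuilder interface below is a hypothetical stand-in for the mapping class mentioned above, not the code at that URL; update and delete listeners would follow the same pattern.

import java.net.MalformedURLException;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.hibernate.event.PostInsertEvent;
import org.hibernate.event.PostInsertEventListener;

/**
 * Sketch of a post-insert listener in the spirit of SolrSync: anything the
 * document builder recognizes is pushed to Solr as soon as Hibernate saves it.
 */
public class SolrSyncListener implements PostInsertEventListener {

  /** Hypothetical stand-in for the "SolrDocumentBuilder" mentioned above. */
  public interface SolrDocumentBuilder {
    boolean handles(Object entity);
    SolrInputDocument build(Object entity);
  }

  private final SolrServer solr;
  private final SolrDocumentBuilder builder;

  public SolrSyncListener(String solrUrl, SolrDocumentBuilder builder)
      throws MalformedURLException {
    this.solr = new CommonsHttpSolrServer(solrUrl);
    this.builder = builder;
  }

  public void onPostInsert(PostInsertEvent event) {
    Object entity = event.getEntity();
    if (!builder.handles(entity)) {
      return; // not an indexed entity, ignore it
    }
    try {
      // Commit timing is left to the server's autoCommit or to a
      // client-side queue like the one described later in this thread.
      solr.add(builder.build(entity));
    } catch (Exception e) {
      // Production code would log and retry rather than fail the save.
      throw new RuntimeException("Failed to sync entity to Solr", e);
    }
  }
}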
Also, "pooling for 30 seconds on the client side..." - are you referring to keeping data cached in the Solr client for 30 seconds and every 30 second sending it to Solr for indexing?
We are currently running with a single Solr server (no replication or load balancing), with multiple webapps pointing to it. Rather than manage commit timing on the client side, we have autoCommit set to 1 second, so the multiple webapps can't start overlapping commits. Since a commit flushes the caches and forces the searchers to be reopened, we want to do it as little as possible. The quick commit is required for instant access to uploaded images, but not for stuff that can take a bit longer. For the other stuff the client keeps a queue (I called it pooling) of documents to send; every 30 secs, it sends them to Solr in a bulk update. That time could be longer; I found 30 seconds was the minimum needed to avoid multiple unnecessary commits for our usage patterns.
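A minimal sketch of that client-side queue, assuming the same SolrJ SolrServer client as above; names like QueuedSolrUpdater and enqueue are illustrative, not the production code. The flush sends one bulk add and issues no explicit commit, leaving visibility to the server's autoCommit.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.common.SolrInputDocument;

/**
 * Sketch of the "pooling" described above: documents that do not need to be
 * searchable immediately are queued and sent to Solr in one bulk add every
 * 30 seconds, so the server's autoCommit fires far less often.
 */
public class QueuedSolrUpdater {

  private final ConcurrentLinkedQueue<SolrInputDocument> queue =
      new ConcurrentLinkedQueue<SolrInputDocument>();
  private final ScheduledExecutorService scheduler =
      Executors.newSingleThreadScheduledExecutor();
  private final SolrServer solr;

  public QueuedSolrUpdater(SolrServer solr) {
    this.solr = solr;
    // Background thread: drain the queue and push everything at once.
    scheduler.scheduleWithFixedDelay(new Runnable() {
      public void run() {
        flush();
      }
    }, 30, 30, TimeUnit.SECONDS);
  }

  /** Called for documents that do not need to be searchable instantly. */
  public void enqueue(SolrInputDocument doc) {
    queue.add(doc);
  }

  /** Send queued documents as a single bulk update; no explicit commit,
   *  the server-side autoCommit takes care of visibility. */
  void flush() {
    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    SolrInputDocument doc;
    while ((doc = queue.poll()) != null) {
      batch.add(doc);
    }
    if (batch.isEmpty()) {
      return;
    }
    try {
      solr.add(batch);
    } catch (Exception e) {
      // A production version would log and re-queue the failed batch.
      throw new RuntimeException("Bulk update to Solr failed", e);
    }
  }
}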
If so, why not index continuously, either in real-time or in some background thread that feeds off of a "to index" queue?
Yes, we have a background thread that queues changes and sends them all at once.

ryan