Can you elaborate on "running SOLR-20 with a hibernate-solr auto link"?  You 
mean you listen to Hibernate events and use them to keep the index served by Solr in sync 
with the DB?


I built a HibernateEventWatcher, modeled after the Compass framework,
that automatically gets notified on insert/update/delete.  Anything
saved that the "SolrDocumentBuilder" knows what to do with gets sent
to Solr automatically.  This way the Solr index stays in sync with the
SQL database without any explicit work.
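
As a rough illustration of that flow (not the actual SolrSync code), here
is a minimal sketch.  DocumentBuilder, IndexClient, InMemoryIndex and
EntityChangeWatcher are all hypothetical names made up for this example.
In a real setup the two callbacks would presumably be driven by
Hibernate's PostInsertEventListener, PostUpdateEventListener and
PostDeleteEventListener, and IndexClient would wrap a real Solr client.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical counterpart of the "SolrDocumentBuilder": maps an entity to index fields. */
interface DocumentBuilder {
    /** Returns the fields to index (including "id"), or null if this entity type is not indexed. */
    Map<String, Object> toDocument(Object entity);
}

/** Hypothetical stand-in for the Solr client; a real one would post to the Solr update handler. */
interface IndexClient {
    void add(Map<String, Object> doc);
    void deleteById(String id);
}

/** In-memory IndexClient, only here so the sketch runs without a server. */
class InMemoryIndex implements IndexClient {
    final Map<String, Map<String, Object>> docs = new LinkedHashMap<>();

    public void add(Map<String, Object> doc) {
        docs.put(String.valueOf(doc.get("id")), doc);
    }

    public void deleteById(String id) {
        docs.remove(id);
    }
}

/** Receives entity lifecycle callbacks and mirrors them into the index. */
class EntityChangeWatcher {
    private final DocumentBuilder builder;
    private final IndexClient index;

    EntityChangeWatcher(DocumentBuilder builder, IndexClient index) {
        this.builder = builder;
        this.index = index;
    }

    /** Call after an insert or update has been written to the database. */
    void onSaveOrUpdate(Object entity) {
        Map<String, Object> doc = builder.toDocument(entity);
        if (doc != null) {
            index.add(doc); // entity types the builder does not know are ignored
        }
    }

    /** Call after a delete has been written to the database. */
    void onDelete(Object entity) {
        Map<String, Object> doc = builder.toDocument(entity);
        if (doc != null) {
            index.deleteById(String.valueOf(doc.get("id")));
        }
    }
}
```

The point of routing everything through the builder is that application
code never calls the index directly: saving an entity is enough, and
entities the builder does not recognize simply pass through untouched.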

A first pass at this is here:
http://solrstuff.org/svn/solrj-hibernate/src/org/apache/solr/client/solrj/hibernate/SolrSync.java

Lots of that changed in the production code.  When SOLR-20
stabilizes, I'll put the good bits back in and hopefully post it in a
'contrib' section.


Also, "pooling for 30 seconds on the client side..." - are you referring to 
keeping data cached in the Solr client for 30 seconds and sending it to 
Solr for indexing every 30 seconds?


We are currently running a single Solr server (no replication or load
balancing) with multiple webapps pointing to it.  Rather than manage
commit timing on the client side, we have autoCommit set to 1 second.
That way the multiple webapps can't start overlapping commits.

Since a commit flushes the caches and forces you to reopen the
searchers, we want to do it as little as possible.  Fast commits are
required for instant access to uploaded images, but not for stuff that
can take a bit longer.  For the other stuff, the client keeps a queue
(I called it pooling) of stuff to send.  Every 30 secs, it sends it to
Solr in a bulk update.  That time could be longer; I found 30 seconds
was the minimum needed to avoid multiple unnecessary commits for our
usage patterns.
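
A client-side queue along those lines could be sketched as follows.
This is an illustration under assumptions, not the production code:
BulkUpdateQueue and its bulkSender callback are hypothetical names, and
the callback stands in for whatever call posts a batch of documents to
Solr in a single update request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Collects documents and hands them to a sender in periodic batches. */
class BulkUpdateQueue<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private final Consumer<List<T>> bulkSender; // hypothetical: posts one bulk update
    private ScheduledExecutorService timer;

    BulkUpdateQueue(Consumer<List<T>> bulkSender) {
        this.bulkSender = bulkSender;
    }

    /** Called by application threads; never blocks on the network. */
    void enqueue(T doc) {
        pending.add(doc);
    }

    /** Drains everything queued so far, sends it as one batch, returns the batch size. */
    int flush() {
        List<T> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) {
            bulkSender.accept(batch);
        }
        return batch.size();
    }

    /** Starts a daemon background thread that flushes once per period. */
    synchronized void start(long period, TimeUnit unit) {
        if (timer != null) {
            return; // already running
        }
        timer = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "bulk-update-queue");
            t.setDaemon(true);
            return t;
        });
        timer.scheduleAtFixedRate(() -> {
            try {
                flush();
            } catch (RuntimeException e) {
                // Swallowed so the schedule keeps running; the failed batch is
                // dropped here, a real client would log it and retry.
            }
        }, period, period, unit);
    }

    /** Stops the background thread and sends whatever is still queued. */
    synchronized void stop() {
        if (timer != null) {
            timer.shutdown();
            timer = null;
        }
        flush();
    }
}
```

With this shape, the 30-second interval described above would be
queue.start(30, TimeUnit.SECONDS), while anything that needs to be
searchable immediately (the uploaded images) would bypass the queue and
be sent directly.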


If so, why not index continuously, either in real-time or in some background thread that 
feeds off of a "to index" queue?


Yes, we have a background thread that queues changes and sends them all at once.


ryan
