Roger, yes, it does sound like DIH is the most straightforward approach for you.
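
For the scheduled trigger you describe, here is a minimal sketch of one way to do it, not a tested recipe. It assumes the DIH request handler is registered at /dataimport in solrconfig.xml and that your data-config defines a deltaQuery. The class name, the base URL you pass in, and the timeouts are all made up for illustration. The command, clean, commit and optimize request parameters are DIH's own, but check their defaults against your Solr version.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of Solr): periodically asks DIH for a delta-import.
class DeltaImportTrigger {

    // Builds the delta-import URL. Assumes the handler path is /dataimport.
    static String buildDeltaImportUrl(String solrBaseUrl) {
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        // clean=false keeps existing documents; optimize=false avoids an
        // expensive optimize on every small update.
        return base + "/dataimport?command=delta-import"
                + "&clean=false&commit=true&optimize=false";
    }

    // Sends one GET request and returns the HTTP status code.
    static int triggerOnce(String solrBaseUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create(buildDeltaImportUrl(solrBaseUrl)).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);   // illustrative timeouts, in milliseconds
        conn.setReadTimeout(30000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Starts a background thread that triggers a delta-import, waiting
    // periodMinutes between the end of one request and the start of the next.
    static ScheduledExecutorService start(String solrBaseUrl, long periodMinutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                int status = triggerOnce(solrBaseUrl);
                System.out.println("delta-import requested, HTTP " + status);
            } catch (IOException e) {
                // Catch here so one failed request does not cancel future runs.
                System.err.println("delta-import request failed: " + e.getMessage());
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Calling DeltaImportTrigger.start("http://localhost:8983/solr", 5) on the master would request a delta-import roughly every 5 minutes. A cron job that fetches the same URL would do the same thing without any Java.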

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Roger Kjensrud <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Thursday, September 11, 2008 9:50:28 PM
> Subject: index update and re-building
>
> Hi - I am a newbie to Solr and would like to get some advice on the best
> strategy for updating the index in an environment where content is
> added and searches are executed 24/7. We would also like to have the
> option of doing a full re-index on an as-needed basis.
>
> I was initially looking into using the SolrJ client in conjunction with
> a Hibernate event listener and annotations on the entities. I would
> process entities with special search annotations and then generate the
> documents and send them to the Solr server using SolrJ. But when looking
> at how to do the full re-index, I felt it started to become too complex
> having the Solr server ask for data from the app, which would respond with
> the documents based on some query and annotation processing.
>
> So I started to look at the DataImportHandler, which runs queries
> directly against the database, circumventing any integration with
> Hibernate. Our requirement is to keep the index updated as close to
> realtime as possible (max 5 min. lag). Looking at the DataImportHandler,
> we would need to trigger it with some type of scheduler, which seems
> easy to set up on the master server. But I have seen comments on the
> mailing lists saying that running an update every 5 min could be
> excessive. Is that a problem? I assume it depends on how many updates
> there are in the timeframe - we anticipate at most 100 updates
> in a 5 min span.
>
> Is the DataImportHandler the best approach in this case? Or are there
> other approaches to consider?
>
> Thanks for your time,
> Roger Kjensrud