On 3/14/2011 9:38 PM, onlinespend...@gmail.com wrote:
> But my main question is, how do I guarantee that data between my Cassandra
> database and Solr index are consistent and up-to-date?

Our MySQL database has two unique indexes. One is a document ID, implemented in MySQL as an autoincrement integer and in Solr as a long. The other is what we call a tag id, implemented in MySQL as a varchar and in Solr as a single lowercased token; it serves as Solr's uniqueKey. We have an update trigger on the database that assigns a new document ID whenever a database document is updated.

We have a homegrown build system for Solr. In a nutshell, it keeps track of the newest document ID in the Solr index. If the DIH delta-import fails, it doesn't update the stored ID, which means that on the next run it will try to index those documents again. Changes to existing entries in the database are picked up automatically because their document ID is now newer than the stored one. The tag id doesn't change, though, so the existing document in Solr is overwritten rather than duplicated.
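
To illustrate the idea, here is a rough Python sketch. It is not the real build system: FakeSolr, delta_build, and the field names did and tag_id are made up for the example, and an in-memory dict stands in for Solr and the DIH delta-import.

```python
from typing import Dict, List

Row = Dict[str, object]


class FakeSolr:
    """In-memory stand-in for a Solr index keyed on tag_id (the uniqueKey)."""

    def __init__(self) -> None:
        self.docs: Dict[str, Row] = {}

    def add(self, rows: List[Row]) -> None:
        for row in rows:
            # Same uniqueKey: the older copy is replaced, not duplicated.
            self.docs[str(row["tag_id"])] = row


def delta_build(db_rows: List[Row], solr: FakeSolr, last_did: int) -> int:
    """Index rows with did > last_did; return the ID to store for the next run."""
    pending = [r for r in db_rows if int(r["did"]) > last_did]
    if not pending:
        return last_did
    try:
        solr.add(pending)
    except Exception:
        # Import failed: keep the old ID so the next run retries these rows.
        return last_did
    # Import succeeded: advance the stored ID to the newest one indexed.
    return max(int(r["did"]) for r in pending)
```

The caller would persist the returned value between runs. Because a failed run returns the old ID unchanged, the same rows are selected again next time, and re-indexing them is harmless since they overwrite by tag_id.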

Things are actually more complex than I've written, because our index is distributed. Hopefully it can give you some ideas for yours.

Shawn
