Do you have to have your data in Solr as soon as it's added to the DB? Probably not. And what if somebody manually changes the DB? Your index will be out of sync with it. We see similar situations pretty frequently, and our solution is a standalone DB indexing application that knows how to do incremental indexing and detect deleted rows/documents as well as updates. So I'd suggest you think about that approach.
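A rough sketch of the polling loop such an indexer might run (Python, stdlib only; the docs table with its last_modified timestamp and deleted soft-delete flag is a made-up schema, and the URL assumes a local Solr with the standard XML update handler):

import sqlite3
import urllib.request
from xml.sax.saxutils import escape

SOLR_UPDATE_URL = "http://localhost:8983/solr/update"  # adjust to your setup

def post_to_solr(xml):
    # Solr's XML update handler accepts <add>, <delete> and <commit/>.
    req = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"})
    urllib.request.urlopen(req).read()

def index_changes(conn, last_run):
    # Hypothetical schema: docs(id, title, last_modified, deleted).
    # Deletes are only detectable here because rows are soft-deleted.
    rows = conn.execute(
        "SELECT id, title, deleted FROM docs WHERE last_modified > ?",
        (last_run,))
    for doc_id, title, deleted in rows:
        if deleted:
            post_to_solr("<delete><id>%s</id></delete>" % escape(str(doc_id)))
        else:
            post_to_solr(
                '<add><doc>'
                '<field name="id">%s</field>'
                '<field name="title">%s</field>'
                '</doc></add>' % (escape(str(doc_id)), escape(title)))
    post_to_solr("<commit/>")

if __name__ == "__main__":
    # Run from cron; in real use, persist the high-water mark between runs.
    index_changes(sqlite3.connect("app.db"), last_run="1970-01-01 00:00:00")

The soft-delete flag matters: a hard DELETE leaves nothing to poll, so you'd need tombstone rows or a periodic comparison against the index instead.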
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Leonardo Santagada <[EMAIL PROTECTED]>
To: Norberto Meijome <[EMAIL PROTECTED]>
Cc: solr-user@lucene.apache.org
Sent: Saturday, January 12, 2008 10:16:39 AM
Subject: Transactions and Solr Was: Re: Delete by multiple id problem

On 12/01/2008, at 11:24, Norberto Meijome wrote:

> On Fri, 11 Jan 2008 00:43:19 -0200
> Leonardo Santagada <[EMAIL PROTECTED]> wrote:
>
>> No, actually my problem is that the Solr index is mirroring data on a
>> database (a Zope app, to be more accurate), so it would be better if I
>> could send the whole transaction together so I don't have to keep it
>> in separate files... which I have to do so that I don't send anything
>> if the transaction is aborted (I can't abort a Solr add, right?).
>>
>> Maybe I should explain more, but I think this is pretty common for
>> anyone trying to keep database transactions and a Solr index in sync,
>> as Solr doesn't support two-phase commit or anything like that.
>
> Hola Leonardo,
> I haven't had to do this, but I am starting to design something
> along these lines.
>
> If you execute your 'add's and 'delete's from a stored proc, inside a
> transaction, you can simply have an extra table with Solr doc ids
> and the action to perform (add / delete). E.g.,
>
>   exec delete_from_my_db('xyz') ->
>     begin transaction
>       {do here all your DB work}
>       {add to tblSolrWork the ID to delete}
>     end transaction
>
> Hence, if the transaction fails, those records will never actually
> exist.

Not that simple. For example, another add with the same unique key
should remove that key from the pending deletes, and you end up storing
the whole document data twice so you know what to send to Solr. You
also have to save a serial number for each transaction so that the adds
and deletes are replayed to Solr in the right order. And having one
table that manages all this in the same relational database could mean
a big drop in performance, as everything you do on your DB would lock,
write to, and read from a single table (or a couple of tables), which
also makes your life a living hell :). (A sketch of this work-queue
pattern follows this message.)

What I am doing in Zope is firing events when documents are added,
updated or removed; I then join the transaction with my own transaction
manager, which orders the adds and saves an XML file to be sent to Solr
(see the second sketch after this message). The problems with this are
the ones mentioned; it would be simpler if a single file could carry
all types of commands to Solr (add and delete are the ones I am using).

> Whether and how you could do this in Zope, I have no idea, but if
> you solve it, it would be great if you could share it here.
>
> You could also make use of triggers (onInsert/onUpdate and onDelete
> triggers), but I suppose that is a bit more DB-dependent than plain
> SP work - though it may be simpler to implement than changing all
> your code to call the SP instead of direct SQL cmds...

Probably, but I think that would still hit performance really hard on
a relational database that holds a lot more than just documents. Does
anyone have more experience doing this kind of stuff and want to share?

--
Leonardo Santagada
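A minimal sketch of the work-queue table discussed above (Python with sqlite3 for brevity; tblSolrWork's layout and the enqueue helper are illustrative, not from the thread). The serial column preserves replay order, and a new entry for a key supersedes any pending entry for the same key:

import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT)")
conn.execute("""CREATE TABLE IF NOT EXISTS tblSolrWork (
    serial  INTEGER PRIMARY KEY AUTOINCREMENT,  -- replay order for the worker
    doc_id  TEXT NOT NULL,
    action  TEXT NOT NULL CHECK (action IN ('add', 'delete')),
    payload TEXT  -- full document data for 'add', NULL for 'delete'
)""")

def enqueue(conn, doc_id, action, payload=None):
    # An add for a key supersedes its pending delete (and vice versa),
    # so drop any older queued entries for this doc first.
    conn.execute("DELETE FROM tblSolrWork WHERE doc_id = ?", (doc_id,))
    conn.execute(
        "INSERT INTO tblSolrWork (doc_id, action, payload) VALUES (?, ?, ?)",
        (doc_id, action, payload))

# The enqueue runs inside the same DB transaction as the real work,
# so an aborted transaction leaves no Solr work behind.
with conn:  # sqlite3: commits on success, rolls back on exception
    conn.execute("DELETE FROM docs WHERE id = ?", ("xyz",))
    enqueue(conn, "xyz", "delete")

A separate worker then drains tblSolrWork in serial order, posts each entry to Solr, and removes it; that worker is the only thing contending on the queue table besides the enqueues themselves, which is exactly the locking hot spot Leonardo is worried about.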
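And a minimal sketch of the Zope-side approach, using the transaction package's data-manager protocol (the SolrDataManager class and its send callable are made up for illustration; this shows the shape of the idea, not Leonardo's actual code):

import transaction  # the Zope transaction package (pip install transaction)

class SolrDataManager:
    """Buffers Solr XML commands and only sends them if the surrounding
    transaction commits."""

    def __init__(self, send):
        self.send = send   # callable that posts one XML string to Solr
        self.queue = []    # ordered list of <add>/<delete> commands

    def add(self, xml):
        self.queue.append(xml)

    # -- data-manager protocol -------------------------------------
    def abort(self, txn):
        self.queue = []

    def tpc_begin(self, txn):
        pass

    def commit(self, txn):
        pass

    def tpc_vote(self, txn):
        # Last chance to fail before everyone commits - as close to
        # two-phase commit as Solr allows. Writing the queued XML to a
        # file here would let a crash between vote and finish be replayed.
        pass

    def tpc_finish(self, txn):
        for xml in self.queue:   # queue order preserves add/delete order
            self.send(xml)
        self.send("<commit/>")
        self.queue = []

    tpc_abort = abort

    def sortKey(self):
        # Sort after the real databases so we only talk to Solr
        # once they have all voted.
        return "~solr"

# Usage: on each Zope event, queue the command; join the txn once.
dm = SolrDataManager(send=print)  # 'print' stands in for an HTTP post
transaction.get().join(dm)
dm.add('<add><doc><field name="id">xyz</field></doc></add>')
transaction.commit()  # only now does anything reach Solr

If the transaction aborts instead, abort() empties the queue and Solr never hears about it, which is exactly the "don't send anything if the transaction is aborted" behaviour the thread is after.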