: Does anyone have more experience doing this kind of stuff and wants to share?
My advice: don't. I work with (or work with people who work with) about two dozen Solr indexes, and we don't attempt to update a single one of them in any sort of transactional way.

Some of them are updated in "real time" (ie: as soon as the authoritative DB is updated by some code, the same code updates the Solr index). Some of them are updated in batch (ie: once every N minutes, code checks a log of all logical objects modified/deleted from the DB and sends the adds/deletes to Solr). And some are only ever rebuilt from scratch every N hours (because the data in them isn't very time sensitive, and rebuilding from scratch is easier than dealing with incremental or batch updates).

But as I said: we never attempt to be transactional about it, for a few reasons:

1) Why should it be part of the transaction? A Solr index is a denormalized/inverted index of data ... why should a tool (or any other process) be prevented from writing to an authoritative data store just because a non-authoritative copy of that data can't be updated? If you used MySQL with replication, would you really want to block all writes to the master just because there's a glitch in replicating to a slave?

2) Why worry about it? It's really a non-issue. If an add or delete fails, it's usually either developer error (ie: the code generating your add statements refers to a field that doesn't exist), a transient timeout (maybe because of a commit in progress) or network glitch (have the client retry once or twice), or in very rare instances the whole Solr index was completely jacked (either from disk failure, or OOM due to a huge spike in load) and we want to revert to a backup of the index in the short term and rebuild the index from scratch to play it safe.

3) Why limit yourself?
You're going to want the ability to trigger arbitrary indexing of your data objects at any time (if for no other reason than so that when you decide to add a field to your index you can reindex them all), so why make your index updating code inherently tied to your DB updating code?

As for your specific question along the lines of "why can't we do a mix of <add>s and <delete>s all as part of one update message?" the answer is "because no one ever wrote any code to parse messages like that."

BUT! ... that's not the question you really want to ask. The question you really want to ask is: "*IF* someone wrote code to allow a mix of <add>s and <delete>s all as part of one update message, would it solve my problem of wanting to be able to modify my Solr index transactionally?" and the answer is "No."

Even if Solr accepted update messages that looked like this...

  <update>
    <delete><id>42</id></delete>
    <add><field name="id">7</field><field name="a">bb</field></add>
    <add><field name="id">666</field><field name="a">cccc</field></add>
  </update>

...the low level Lucene calls that it would be doing internally still aren't transactional. The "delete" and the first "add" might succeed, but if there was then some kind of internal error, or a timeout because the first add took a while (maybe it triggered a segment merge) and the second add didn't happen, the first two commands would still have been executed, and there would be no way to "rollback".

In a nutshell: you would be no better off than if your client code had sent all three as separate update messages.

-Hoss
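
P.S. For what it's worth, here is a rough Python sketch of the "send them as separate update messages and have the client retry once or twice" approach. Everything in it is made up for illustration: the function names (delete_message, add_message, send_with_retry, send_all), the retry count, and the `post` callable, which stands in for whatever HTTP client you use to POST XML to your Solr update handler and is assumed to raise IOError on a timeout or network glitch. The add message wraps its fields in <doc>, which is what Solr's XML update format expects.

```python
from xml.sax.saxutils import escape, quoteattr


def delete_message(doc_id):
    """Build a Solr XML delete-by-id message."""
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


def add_message(fields):
    """Build a Solr XML add message for one document (fields: name -> value)."""
    body = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in fields.items()
    )
    return "<add><doc>%s</doc></add>" % body


def send_with_retry(post, message, retries=2):
    """Send one message via the hypothetical `post` callable.

    `post` is an assumption: any function that takes the XML string and
    raises IOError on a timeout or network glitch. Returns True if the
    message went through, False if every attempt failed.
    """
    for _ in range(1 + retries):
        try:
            post(message)
            return True
        except IOError:
            continue  # transient failure: try again
    return False


def send_all(post, messages, retries=2):
    """Send each message separately; return the ones that never succeeded.

    Nothing is rolled back: messages that succeeded stay applied, and the
    failures are returned so they can be logged and reindexed later.
    """
    return [m for m in messages if not send_with_retry(post, m, retries)]
```

send_all returns the failures instead of raising for the same reason as above: the DB write isn't held hostage by the index, and whatever didn't make it can be picked up by the next batch run or a full rebuild.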