Greetings Comrades. There have been numerous requests and discussions about using Solr as both a search engine and a NoSQL store at the same time. While Solr is an excellent search engine, it does not look so good when it comes to storing documents and their stored fields, especially with large amounts of data: the index quickly grows to unmanageable sizes. Then there is the ever-recurring PITA of partial document updates: due to the nature of the Lucene/Solr index, documents can't be updated in place; they always need to be deleted and re-inserted. All in all, Solr desperately needs tight integration with some document storage that offloads the stored fields of the document and is transactionally coupled with the search index itself, so that stored fields are at all times in sync with the other parts of the index (terms, doc values etc.).
Unfortunately, unlike Lucene, Solr does not offer a full set of distributed transaction API commands, which seriously complicates the matter. Luckily, since the advent of Solr 4.0 we have the ability not only to create a custom Directory, but also to tweak the index structure completely, any way we like. Based on this new feature I've created a custom Directory + custom codec that integrates Solr with the Oracle NoSQL key-value store. My codec is based on the Solr 4.10.1 API and Oracle NoSQL 1.2.1.8 Community Edition.
Fields in the NoSQL store are persisted under a primary key derived from the document fields. The codec relays stored fields to the NoSQL store while keeping all other index components in the usual file-based storage layout. The codec has been written with SolrCloud and NoSQL's own fault tolerance in mind, hence it tries to skip write commands to the NoSQL store when the index is being built on a replica node that is not currently the Solr shard leader. The first stable version of the codec transparently supports the full index life cycle, including segment creation, merging and deletion.
Source code and a readme detailing usage instructions for the codec can be found on GitHub: https://github.com/andrey42/onsqlcodec I assume there might be other developers trying to solve similar problems, so I'd be interested to hear about similar attempts and any issues encountered while trying to implement such an integration between Solr and other NoSQL databases.
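P.S. For readers who want a feel for the idea without opening the repository, here is a rough, self-contained sketch of the two behaviours described above: deriving a primary key from document fields, and skipping store writes on a node that is not the shard leader. It is only an illustration under stated assumptions, not the codec's actual code. It uses no Lucene, Solr or Oracle NoSQL classes, and the names InMemoryKvStore, StoredFieldsRelay, the "/" key separator and the sample field names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a key-value store client (NOT the Oracle NoSQL API).
final class InMemoryKvStore {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    void put(String key, Map<String, String> storedFields) {
        // Copy the fields so later changes to the caller's map do not leak in.
        rows.put(key, new LinkedHashMap<>(storedFields));
    }

    Map<String, String> get(String key) {
        return rows.get(key);
    }

    int size() {
        return rows.size();
    }
}

// Hypothetical relay: sends stored fields to the store instead of index files.
final class StoredFieldsRelay {
    private final InMemoryKvStore store;
    private final boolean shardLeader;
    private final String[] keyFields;

    StoredFieldsRelay(InMemoryKvStore store, boolean shardLeader, String... keyFields) {
        this.store = store;
        this.shardLeader = shardLeader;
        this.keyFields = keyFields.clone();
    }

    // Assumption: the primary key is the key-field values joined with '/'.
    String primaryKey(Map<String, String> doc) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < keyFields.length; i++) {
            String value = doc.get(keyFields[i]);
            if (value == null) {
                throw new IllegalArgumentException("missing key field: " + keyFields[i]);
            }
            if (i > 0) {
                key.append('/');
            }
            key.append(value);
        }
        return key.toString();
    }

    // Returns true if the document was written, false if the write was skipped
    // because this node is a non-leader replica (the store replicates on its own).
    boolean write(Map<String, String> doc) {
        if (!shardLeader) {
            return false;
        }
        store.put(primaryKey(doc), doc);
        return true;
    }
}

class RelayDemo {
    public static void main(String[] args) {
        InMemoryKvStore store = new InMemoryKvStore();
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("type", "book");
        doc.put("title", "Some stored title");

        StoredFieldsRelay replica = new StoredFieldsRelay(store, false, "type", "id");
        StoredFieldsRelay leader = new StoredFieldsRelay(store, true, "type", "id");

        System.out.println("replica wrote: " + replica.write(doc));
        System.out.println("leader wrote:  " + leader.write(doc));
        System.out.println("stored under:  " + leader.primaryKey(doc));
    }
}
```

In this sketch the replica never touches the store, on the assumption that the key-value store handles its own replication, so a second write of the same stored fields from every Solr replica would be redundant.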