Hi,

We have decided to migrate from Lucene 3.x to the latest Solr. A lot of architectural discussion is going on, and there are two possible approaches.
Please note that our customer-facing app (or any other client) and Search are hosted on different machines.

*1) Have a clean architecture*
- Solr takes care of customized search only.
- We certainly have to override some filtering, scoring, etc.
- There will be an intermediary search-app that
  - receives queries,
  - does A/B-testing assignments and other non-search work,
  - does query expansion / rewriting (so that not every Solr shard has to do it),
  - translates the query into Solr syntax and calls Solr's HTTP API (see the first sketch at the end of this post),
  - returns the response to the customer-facing app, or whatever the client is.

The problem with this approach is the additional layer and the latency between the search-app and Solr: the client of search has to make an API call, across the network, to the intermediary search-app, which in turn makes another HTTP API call to Solr.

*2) Customize Solr to the full extent*
- Do all the crazy stuff within Solr.
- We can literally create a new URL and register a handler class to process it (see the second sketch at the end of this post). With some limitations, we should be able to do almost anything.

The benefit of this approach is that it removes the additional layer and its latency. However, I see a lot of long-term problems: Solr version upgrades become harder, and we lose development flexibility (use of Spring, Hibernate, etc.).

How about distributed search? Where do the above approaches stand there?

I understand that this is a subjective question. It'd be helpful if you could share your thoughts and experiences. Thanks.
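To make the trade-off concrete, here is a minimal sketch of what the intermediary search-app in approach 1 might do, assuming Solr 4.x-era SolrJ (where the client class is HttpSolrServer; newer SolrJ uses HttpSolrClient). The host, port, core name, parameters and field names below are placeholders, not our real setup:

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class SearchAppClient {

        // Hypothetical Solr endpoint; host, port and core name are placeholders.
        private final HttpSolrServer solr =
                new HttpSolrServer("http://solr-host:8983/solr/products");

        public QueryResponse search(String userQuery) throws SolrServerException {
            // The search-app rewrites/expands the raw user query once,
            // so the individual Solr shards don't have to repeat that work.
            String rewritten = rewrite(userQuery);

            SolrQuery query = new SolrQuery(rewritten);
            query.setRows(10);
            query.set("defType", "edismax");        // example: choose a query parser
            query.addFilterQuery("inStock:true");   // example: custom filtering

            return solr.query(query);
        }

        private String rewrite(String q) {
            // Placeholder for query expansion / synonyms / A/B bucketing.
            return q;
        }

        public static void main(String[] args) throws SolrServerException {
            SearchAppClient client = new SearchAppClient();
            for (SolrDocument doc : client.search("ipod").getResults()) {
                System.out.println(doc.getFieldValue("id"));
            }
        }
    }

The extra network hop I'm worried about is exactly this solr.query() call stacked on top of the client's call into the search-app.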
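And a rough sketch of approach 2: a handler class registered under its own URL inside Solr. Rather than starting from RequestHandlerBase, this sketch extends the stock SearchHandler so the normal (and distributed) search components keep working; the class name, package, handler URL and rewrite logic are made up for illustration:

    import org.apache.solr.common.params.ModifiableSolrParams;
    import org.apache.solr.handler.component.SearchHandler;
    import org.apache.solr.request.SolrQueryRequest;
    import org.apache.solr.response.SolrQueryResponse;

    /*
     * Registered in solrconfig.xml (handler name and package are placeholders):
     *
     *   <requestHandler name="/customsearch"
     *                   class="com.example.search.CustomSearchHandler" />
     */
    public class CustomSearchHandler extends SearchHandler {

        @Override
        public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
                throws Exception {
            // Pre-process: rewrite/expand the incoming query before the normal
            // search components run. Request params are read-only, so copy first.
            ModifiableSolrParams params = new ModifiableSolrParams(req.getParams());
            String q = params.get("q");
            if (q != null) {
                params.set("q", rewrite(q));
            }
            req.setParams(params);

            // Delegate to the standard search flow (including distributed search).
            super.handleRequestBody(req, rsp);

            // Post-process rsp here if the response needs reshaping.
        }

        private String rewrite(String q) {
            // Placeholder for query expansion / custom filtering and scoring hooks.
            return q;
        }
    }

Because it delegates to SearchHandler, this stays aware of distributed search, but it also ties our code to Solr's internal APIs, which is exactly where I expect the upgrade pain to come from.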