Hello Mark,

Some of these questions have been touched on recently; see below.

On Wed, Dec 19, 2012 at 10:50 PM, Mark <static.void....@gmail.com> wrote:

> We're currently running Solr 3.5 and our indexing process works as follows:
>
> .....
>
> I also have the following questions.
> Does DIH work with Solr Cloud?

Yes, it seems like it does. Try searching JIRA for something like
https://issues.apache.org/jira/browse/SOLR-4112

> Can Solr Cloud utilize the whole cluster to index in parallel to remove the
> burden of one machine from performing that task.

If you run DIH on one of the cluster nodes, it will distribute docs across
the shards, but it does so one by one, in sequence; i.e. indexing is
distributed but not concurrent. See
http://web.archiveorange.com/archive/v/AAfXfvu1WJopdWvFGBFL#gLyzRJlUi7zW86C

> If so, how is it balanced across all nodes? Can this work with DIH

I don't really get this question, but it works in SolrCloud as usual: DIH
invokes the UpdateProcessor chain, and DistributedUpdateProcessor sends every
doc to the proper shard.

> When we decide to run a full-import how can we due this and not affect our
> existing cluster since there is no real master/slave and obviously no
> staging "master"?

If you disable auto commit, none of the slaves flips its index until DIH
commits explicitly. My feeling is that SolrCloud is not really efficient for
the full-import scenario; it is intended for NRT.
http://web.archiveorange.com/archive/v/AAfXfleaxcoo9y8JuaFm#zCBXziMgfela6B5

> Thanks in advance!
>
> - M

Looking forward to your architecture findings.

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
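
P.S. On the full-import question, here is a rough sketch of how the request to DIH on a single node could be built so that the commit is issued explicitly at the end of the import. The host, the core name `collection1`, and the handler path `dataimport` are assumptions (the path depends on how the handler is registered in your solrconfig.xml), and the helper `dih_full_import_url` is hypothetical. `command`, `clean` and `commit` are the usual DIH request parameters. The sketch only builds the URL and does not send anything.

```python
from urllib.parse import urlencode


def dih_full_import_url(base_url, core, handler="dataimport",
                        clean=True, commit=True):
    """Build the URL that would trigger a DIH full-import on one node.

    base_url, core and handler are placeholders for your own setup.
    """
    params = {
        "command": "full-import",
        # clean=true asks DIH to delete the existing docs before importing
        "clean": str(clean).lower(),
        # commit=true asks DIH to commit once the import has finished
        "commit": str(commit).lower(),
    }
    return f"{base_url.rstrip('/')}/{core}/{handler}?{urlencode(params)}"


# Hypothetical node and core names, for illustration only.
url = dih_full_import_url("http://localhost:8983/solr", "collection1")
print(url)
```

With auto commit disabled, the `commit=true` parameter is what makes the imported docs visible, so passing `commit=False` to the helper would leave the commit to a separate step of your choosing.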